INTRODUCTION


In all sorts of experiments which are not simple repetitions but have at least 
one varying essential circumstance or indefinite variate the experimentalist is 
confronted with a choice in regard to the values of that variate. If the experiments
be quite simple the question may be without great importance; but
when their requirements as to time or expenditure come into account the problem 
arises, how the observations should be chosen in order that a limited number of 
them may give the maximum amount of knowledge. It clearly depends upon the 
relationship between the observed quantity, which we shall name the primary 
variate, and its essential circumstances, the secondary variates, and upon the 
variation of the errors of the observations. 
Choice in the Distribution of Observations


When we deal with, for example, a linear function which it is possible to observe
with the same accuracy for all values of the indefinite variate we should
not hesitate to put the observations in two equally big groups as far apart from 
each other as feasible. But if the standard deviation of the observations be a 
function of the indefinite variate and increases with the distance from the middle 
of the range, where is then the point in which the advantage of removing the two 
groups of observations from each other just counterbalances the disadvantages of 
increasing the error of observations? The problem becomes very complicated for 
functions of higher degrees. 

We shall in this memoir try to contribute to the solution in the case of poly- 
nomial functions by examining the standard deviations of the adjusted and more 
especially the interpolated values of such functions for different distributions of 
observations. Those values inside the working range of observations may be 
considered the sum of knowledge acquired by the experiments. The adjusted 
values outside the working range may probably in exceptional cases be of interest, 
but as only by some other type of experiment we can make sure that the form of 
function holds outside the range they are in ordinary cases without great value. 
We shall therefore aim at finding the distribution of observations which within 
the selected range gives the most satisfactory standard deviations of the adjusted 
values of the function. 

To consider the standard deviations satisfactory we must of course demand 
that they shall be as small as possible, and since a greater accuracy in one part 
may be expected to be accompanied by a smaller accuracy in another part we 
want them in addition to be as near constant as possible. In other words the 
curve of standard deviation with the lowest possible maximum value within the 
working range of observations is what we shall attempt to find. It appears that 
the distribution of observations which fulfils this demand consists of specially placed 
groups in number just sufficient to determine the constants of the function. We 
shall accordingly pay attention also to the desirability usually present of ascer- 
taining the form of function by means of the observations. As might be expected 
we find that the standard deviations obtained from a uniform continuous distri- 
bution of observations increase towards the ends of the range. By choosing a 
uniform continuous distribution with additional clusters at the ends of the range 
we shall try to find a compromise between the two desiderata of a low maximum 
of standard deviation and of a uniform distribution. 

The indefinite variate is supposed to have a vanishing error of observation 
compared with that of the principal variate. This error may be constant or varying 
with the indefinite variate, but in either case it is supposed to follow the typical 
law so closely that the method of least squares may satisfactorily be applied to the 
observations. After having found first the most advantageous distributions for
observations of functions up to the sixth degree with constant standard deviations
we examine the case for observations of functions of the first and of the
second degree which have standard deviations of the form $\sigma(1 + ax)$ and $\sigma(1 + ax^2)$.
If it is profitable to use the whole of the working range the latter distributions
are practically found from the former by multiplying their frequencies by the squared
standard deviations of the observations at the corresponding place. But in cases 
where extrapolation is of advantage, and the whole range therefore not to be used, 
the law of the frequencies has to be examined anew. 

In Section VIII we find for the same two cases of varying error of observa- 
tion the distributions which make each single constant of a function of the first 
and of the second degree a minimum. 


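I. Adjustment of a polynomial function of one variable; general distribution
of observations.

(1) Let $y_1, y_2, \ldots, y_p, \ldots, y_N$ be N observations of a function of nth degree
taken at the points $x_1, x_2, \ldots, x_p, \ldots, x_N$,
$$y = a_0 + a_1 x + a_2 x^2 + \ldots + a_n x^n.$$
Let us assume that from earlier experience we know the standard deviation of an
observation of y to be $\sigma\sqrt{f(x)}$. The method of least squares will then give us the
following system of normal equations in which the sums are to be extended over
all the observations:
$$\begin{aligned}
S\left\{\frac{y_p}{f(x_p)}\right\} &= a_0 S\left\{\frac{1}{f(x_p)}\right\} + a_1 S\left\{\frac{x_p}{f(x_p)}\right\} + \ldots + a_n S\left\{\frac{x_p^n}{f(x_p)}\right\},\\
S\left\{\frac{x_p y_p}{f(x_p)}\right\} &= a_0 S\left\{\frac{x_p}{f(x_p)}\right\} + a_1 S\left\{\frac{x_p^2}{f(x_p)}\right\} + \ldots + a_n S\left\{\frac{x_p^{n+1}}{f(x_p)}\right\},\\
&\;\;\vdots\\
S\left\{\frac{x_p^n y_p}{f(x_p)}\right\} &= a_0 S\left\{\frac{x_p^n}{f(x_p)}\right\} + a_1 S\left\{\frac{x_p^{n+1}}{f(x_p)}\right\} + \ldots + a_n S\left\{\frac{x_p^{2n}}{f(x_p)}\right\}.
\end{aligned} \qquad (1)$$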
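If f(x) is 1 the sums are the moment coefficients of the places of observations
multiplied by N, and in the general case we shall for brevity put
$$S\left\{\frac{x_p^q}{f(x_p)}\right\} = N m_q,$$
so that the normal equations become
$$\frac{1}{N} S\left\{\frac{x_p^i\, y_p}{f(x_p)}\right\} = a_0 m_i + a_1 m_{i+1} + \ldots + a_n m_{i+n}, \qquad i = 0, 1, \ldots, n. \qquad (2)$$
Eliminating the a's between (2) and the equation of the function we get
$$\begin{vmatrix}
y & 1 & x & \ldots & x^n \\
\frac{1}{N} S\left\{\frac{y_p}{f(x_p)}\right\} & m_0 & m_1 & \ldots & m_n \\
\frac{1}{N} S\left\{\frac{x_p y_p}{f(x_p)}\right\} & m_1 & m_2 & \ldots & m_{n+1} \\
\vdots & \vdots & \vdots & & \vdots \\
\frac{1}{N} S\left\{\frac{x_p^n y_p}{f(x_p)}\right\} & m_n & m_{n+1} & \ldots & m_{2n}
\end{vmatrix} = 0 \qquad (3),$$
which determines the adjusted y corresponding to the same x.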
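(2) To find the standard deviation $\sigma_{y_r}$ of an adjusted $y_r$ it will be easiest to
start from the equations (2). If the first be multiplied by $\alpha_0$, the second by $\alpha_1$ and
so on before summing, and if we choose $\alpha_0, \alpha_1, \ldots, \alpha_n$ so that
$$\begin{aligned}
\alpha_0 m_0 + \alpha_1 m_1 + \alpha_2 m_2 + \ldots + \alpha_n m_n &= 1,\\
\alpha_0 m_1 + \alpha_1 m_2 + \alpha_2 m_3 + \ldots + \alpha_n m_{n+1} &= x_r,\\
\alpha_0 m_2 + \alpha_1 m_3 + \alpha_2 m_4 + \ldots + \alpha_n m_{n+2} &= x_r^2,\\
&\;\;\vdots\\
\alpha_0 m_n + \alpha_1 m_{n+1} + \alpha_2 m_{n+2} + \ldots + \alpha_n m_{2n} &= x_r^n
\end{aligned} \qquad (4),$$
we find that
$$y_r = \frac{1}{N} S\left\{\frac{y_p}{f(x_p)}\left[\alpha_0 + \alpha_1 x_p + \alpha_2 x_p^2 + \ldots + \alpha_n x_p^n\right]\right\}$$
and therefore
$$\sigma_{y_r}^2 = \frac{\sigma^2}{N^2} S\left\{\frac{1}{f(x_p)}\left[\alpha_0 + \alpha_1 x_p + \alpha_2 x_p^2 + \ldots + \alpha_n x_p^n\right]^2\right\}.$$
By multiplying out the square this may be written
$$\begin{aligned}
\sigma_{y_r}^2 = \frac{\sigma^2}{N}\{\;&\alpha_0\left[\alpha_0 m_0 + \alpha_1 m_1 + \alpha_2 m_2 + \ldots + \alpha_n m_n\right]\\
+\;&\alpha_1\left[\alpha_0 m_1 + \alpha_1 m_2 + \alpha_2 m_3 + \ldots + \alpha_n m_{n+1}\right]\\
+\;&\alpha_2\left[\alpha_0 m_2 + \alpha_1 m_3 + \alpha_2 m_4 + \ldots + \alpha_n m_{n+2}\right]\\
+\;&\ldots\\
+\;&\alpha_n\left[\alpha_0 m_n + \alpha_1 m_{n+1} + \alpha_2 m_{n+2} + \ldots + \alpha_n m_{2n}\right]\},
\end{aligned}$$
or applying (4)
$$\sigma_{y_r}^2 = \frac{\sigma^2}{N}\left(\alpha_0 + \alpha_1 x_r + \alpha_2 x_r^2 + \ldots + \alpha_n x_r^n\right) \qquad (5).$$
Hence $\sigma_{y_r}^2$ is found by elimination of the $\alpha$'s between (4) and (5), which results in
$$\begin{vmatrix}
\frac{N \sigma_{y_r}^2}{\sigma^2} & 1 & x_r & x_r^2 & \ldots & x_r^n \\
1 & m_0 & m_1 & m_2 & \ldots & m_n \\
x_r & m_1 & m_2 & m_3 & \ldots & m_{n+1} \\
\vdots & \vdots & \vdots & \vdots & & \vdots \\
x_r^n & m_n & m_{n+1} & m_{n+2} & \ldots & m_{2n}
\end{vmatrix} = 0 \qquad (6).$$
This determinant is of fundamental importance for all the following work and
it will be useful at once to examine it more closely.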
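The passage from (4) and (5) to a ratio of determinants can be checked numerically. The sketch below is only an illustration under stated assumptions: f(x) = 1, the observations are given as places with relative frequencies (so that $m_q$ is a weighted power sum), and the function names are hypothetical. It computes $N\sigma_{y_r}^2/\sigma^2$ once by solving (4) for the multipliers and once from the bordered determinant.

```python
import numpy as np

def moment_matrix(places, weights, n):
    """Matrix with entries m_{i+j} = sum_k weights[k] * places[k]**(i+j)."""
    x = np.asarray(places, dtype=float)
    w = np.asarray(weights, dtype=float)
    m = [float(np.sum(w * x ** q)) for q in range(2 * n + 1)]
    return np.array([[m[i + j] for j in range(n + 1)] for i in range(n + 1)])

def scaled_variance(x_r, places, weights, n):
    """N * sigma_y^2 / sigma^2 at x_r, via equations (4) and (5), f(x) = 1."""
    M = moment_matrix(places, weights, n)
    v = float(x_r) ** np.arange(n + 1)      # (1, x_r, ..., x_r^n)
    alpha = np.linalg.solve(M, v)           # multipliers solving (4)
    return float(alpha @ v)                 # bracket of (5)

def scaled_variance_by_determinants(x_r, places, weights, n):
    """Same quantity after eliminating the multipliers (bordered determinant)."""
    M = moment_matrix(places, weights, n)
    v = float(x_r) ** np.arange(n + 1)
    bordered = np.zeros((n + 2, n + 2))     # zero in the corner, powers of x_r on the border
    bordered[0, 1:] = v
    bordered[1:, 0] = v
    bordered[1:, 1:] = M
    return float(-np.linalg.det(bordered) / np.linalg.det(M))

if __name__ == "__main__":
    # Two equal groups at -1 and 1, straight line: both routes give 1 + x^2.
    print(scaled_variance(0.5, [-1, 1], [0.5, 0.5], 1))
    print(scaled_variance_by_determinants(0.5, [-1, 1], [0.5, 0.5], 1))
```

Solving the linear system is the numerically safer route; the determinant form is kept only because it mirrors the elimination described in the text.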


(3) First however it may be pointed out that the standard deviation of any
other linear function
$$b = b_0 a_0 + b_1 a_1 + b_2 a_2 + \ldots + b_n a_n$$
of the constants of the function y may be determined in quite the same way by

In particular $\sigma_{a_p}$ is found from
(4) Let us call a determinant, identical with that of (6) except that it has 0
instead of the element $\frac{N\sigma_{y_r}^2}{\sigma^2}$, $\Delta$, let $\Delta_{r,s}$ be its minor not containing the rth row
and sth column, again let $\Delta_{r,s,p,q}$ be the minor of this not containing the pth
row and the qth column of $\Delta$. We then find from (8)


2 0 Avy, 942, 161 
ps TS a TE a (9). 
With this notation we obtain from (6)
$$\sigma_{y_r}^2 = -\frac{\sigma^2}{N}\cdot\frac{\Delta}{\Delta_{1,1}} \qquad (10).$$


In the following we shall drop the index r and indicate by $_n\sigma_y$ the standard
deviation of a y adjusted by means of a function of the nth degree.

If we were dealing with a function of $(n-1)$st degree and retained the observations
distributed as before we should find

of a function of x. In the same way we can express $_{n-1}\sigma_y^2 - {}_{n-2}\sigma_y^2$ and thus further

It will be seen that the squared standard deviation of an adjusted y is a function of
the 2nth degree of x. The coefficient of $x^{2n}$ is the square of [...] which, as
was just seen, is the factor with which [...]; it is therefore positive and can never vanish.


(5) If all the m's with odd indices are zero it is seen from (6) that $\sigma_y$ is a function
of $x^2$. This is, at least in theory, a natural thing to aim at, since our general
purpose is to find a curve for $\sigma_y$ giving as nearly as possible a constant value for
$\sigma_y$ throughout the range.

Rearranging the order of rows and columns in (6) we get, when all $m_{2q+1} = 0$
and $n = 2p$,

For a function of the degree $2p - 1$ we get the same determinant as in (12)
except that it does not contain the row and column in which $x^{2p}$ is found.

Hence we find

(6) The last two determinant ratios of (13) and (14) are identical, and when
the numerator of the first fraction of (13) is indicated by $\beta$ we therefore find

Comparing [...] and [...] we see that they have the first determinant ratio
in common and that when $\gamma$ stands for the numerator of the other fraction of

The general formula (11) hence for any $m_{2q+1} = 0$ takes the shape
(7) Before leaving the general case and treating special distributions of
observations three auxiliary propositions shall be proved. We shall first prove that
the curve of $_n\sigma_y^2$ can never be entirely below $\frac{\sigma^2}{N}\cdot\frac{n+1}{m_0}$. With that purpose $_n\sigma_y^2$ will
be summed over all the places of observation with the weight $\frac{1}{f(x)}$, i.e. for a
continuous distribution of observations, the expression $_n\sigma_y^2\,\frac{\phi(x)}{f(x)}$, $\phi(x)\,dx$ being
the number of observations, will be integrated over the range of observations.


Looking first at the numerator of the last term of (11) we find that it can be 
expanded into 


MGT weoece 74 | x Wy Or “aoemae Mn—1 


(= a, a2 mM lm copoos lea 


n 
CE Ge MeN Ga cose nn sy 
Nicene Maes Van ie va My | 
Too 5 
Kar Nog ice oe ee + | a Tips. “CLS, SO Pee tales Atma, m421f + 
ETL On vee feces Mon—1 


Now $\int \frac{x^q}{f(x)}\,\phi(x)\,dx$ integrated over all the observations is what we have called
$N m_q$. When integrating the determinants we therefore find that the first n of
them will vanish, two of their columns consisting of proportional elements, whereas
the integral of the last determinant is


Ules iliy> SEG aoe Mey 
NO ee Mao. MN, 


NE ecemnyote A105 Beets uly, ce (== (= 1)” NAG 4: 


. . | 


Us UDP ALIS Rea Magn 
As $\Delta_{1,\,n+2,\,n+2,\,1} = -\Delta_{n+2,\,n+2,\,1,\,1}$, the integral of the last term of (11) equals $\sigma^2$.
The integration of the other terms, including the first, gives the same result so that
$$\int {}_n\sigma_y^2\,\frac{\phi(x)}{f(x)}\,dx = \sigma^2 (n+1),$$
and as
$$\int \frac{\phi(x)}{f(x)}\,dx = N m_0,$$
the mean value of $_n\sigma_y^2$ calculated in this special way is
$$\frac{\sigma^2}{N}\cdot\frac{n+1}{m_0}.$$
It is therefore clear either that $_n\sigma_y^2$ must at all the places of observation be
equal to $\frac{\sigma^2}{N}\cdot\frac{n+1}{m_0}$ or that $_n\sigma_y^2$ must at some of these places be greater. The first case
cannot be realised by a distribution of which any part is continuous, as $_n\sigma_y^2$ is proved
to be of the 2nth degree in x. If therefore we could find a distribution consisting
of groups of observations for which at all the places of observation $_n\sigma_y^2$ was equal
to $\frac{\sigma^2}{N}\cdot\frac{n+1}{m_0}$, and if further we could choose the places of observation so that $_n\sigma_y^2$
at all other places within the range of observations was smaller than that value,
we should know that no other distribution of observations with that value for $m_0$
could provide a curve of standard deviation with a lower maximum.

If the standard deviation of the observations be constant and equal $\sigma$, $f(x)$
equals 1, and so does $m_0$. After what we have just proved the maximum of the $_n\sigma_y^2$
curve cannot then be lower than $\frac{\sigma^2}{N}(n+1)$. Now when we choose to distribute our
N observations in (n + 1) equally big groups the adjusted y at each of these (n + 1)
places will be the mean of the observations and its squared standard deviation will
be $\frac{\sigma^2}{N}(n+1)$. Hence our problem is reduced to find out how to arrange a table of
(n + 1) values of a function of the nth degree to make the squared standard deviation
of any interpolation result inside the range smaller than the squared standard
deviation of the values of the table. It will be seen in what follows that this can
up to n equal 6 (that is, so far as the problem here has been investigated) be
obtained by one and only one form of grouping.

When the standard deviation of the observations varies over the range, $m_0$
varies with the different distributions, and we cannot use the same method for
finding the best distribution. It even appears that the best distribution has not
always its maxima at the places of observation.
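The first proposition is easy to try on an arbitrary grouping. The following sketch assumes f(x) = 1 (so $m_0 = 1$), takes hypothetical places and relative frequencies, and checks that the frequency-weighted mean of $N\sigma_y^2/\sigma^2$ over the places of observation is n + 1, so that at least one place must reach that value.

```python
import numpy as np

def scaled_variance(x_r, places, weights, n):
    """N * sigma_y^2 / sigma^2 at x_r for a fit of degree n, f(x) = 1."""
    x = np.asarray(places, dtype=float)
    w = np.asarray(weights, dtype=float)
    powers = np.arange(n + 1)
    V = x[:, None] ** powers                 # rows (1, x_k, ..., x_k^n)
    M = V.T @ (w[:, None] * V)               # moment matrix with entries m_{i+j}
    v = float(x_r) ** powers
    return float(v @ np.linalg.solve(M, v))

def weighted_mean_variance(places, weights, n):
    """Mean of the scaled variance over the places, weighted by frequency."""
    return sum(w_k * scaled_variance(x_k, places, weights, n)
               for x_k, w_k in zip(places, weights))

if __name__ == "__main__":
    places = [-1.0, -0.3, 0.2, 0.7, 1.0]     # illustrative values only
    weights = [0.10, 0.30, 0.20, 0.25, 0.15]
    print(weighted_mean_variance(places, weights, 2))   # close to n + 1 = 3
```

The result does not depend on the particular places or frequencies chosen, which is the content of the proposition.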


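(8) A second problem which we want to consider here is the condition for two
adjusted y's being uncorrelated. In the beginning of this section it has been shown
that the adjusted $y_r$,
$$y_r = \frac{1}{N} S\left\{\frac{y_p}{f(x_p)}\left[\alpha_0 + \alpha_1 x_p + \alpha_2 x_p^2 + \ldots + \alpha_n x_p^n\right]\right\},$$
when
$$\begin{aligned}
\alpha_0 m_0 + \alpha_1 m_1 + \alpha_2 m_2 + \ldots + \alpha_n m_n &= 1,\\
\alpha_0 m_1 + \alpha_1 m_2 + \alpha_2 m_3 + \ldots + \alpha_n m_{n+1} &= x_r,\\
\alpha_0 m_2 + \alpha_1 m_3 + \alpha_2 m_4 + \ldots + \alpha_n m_{n+2} &= x_r^2,\\
&\;\;\vdots\\
\alpha_0 m_n + \alpha_1 m_{n+1} + \alpha_2 m_{n+2} + \ldots + \alpha_n m_{2n} &= x_r^n
\end{aligned} \qquad (16).$$

Let $y_s$ be another adjusted value, then
$$y_s = \frac{1}{N} S\left\{\frac{y_p}{f(x_p)}\left[\gamma_0 + \gamma_1 x_p + \gamma_2 x_p^2 + \ldots + \gamma_n x_p^n\right]\right\},$$
where
$$\begin{aligned}
\gamma_0 m_0 + \gamma_1 m_1 + \gamma_2 m_2 + \ldots + \gamma_n m_n &= 1,\\
\gamma_0 m_1 + \gamma_1 m_2 + \gamma_2 m_3 + \ldots + \gamma_n m_{n+1} &= x_s,\\
\gamma_0 m_2 + \gamma_1 m_3 + \gamma_2 m_4 + \ldots + \gamma_n m_{n+2} &= x_s^2,\\
&\;\;\vdots\\
\gamma_0 m_n + \gamma_1 m_{n+1} + \gamma_2 m_{n+2} + \ldots + \gamma_n m_{2n} &= x_s^n
\end{aligned} \qquad (17).$$

Hence the condition that $y_r$ and $y_s$ are uncorrelated is, since the squared standard
deviation of the observed $y_p$ equals $\sigma^2 f(x_p)$,
$$\frac{\sigma^2}{N^2} S\left\{\frac{1}{f(x_p)}\left[\alpha_0 + \alpha_1 x_p + \ldots + \alpha_n x_p^n\right]\left[\gamma_0 + \gamma_1 x_p + \ldots + \gamma_n x_p^n\right]\right\} = 0,$$
or
$$\begin{aligned}
&\gamma_0\, S\left\{\frac{1}{f(x_p)}\left[\alpha_0 + \alpha_1 x_p + \ldots + \alpha_n x_p^n\right]\right\}
+ \gamma_1\, S\left\{\frac{x_p}{f(x_p)}\left[\alpha_0 + \alpha_1 x_p + \ldots + \alpha_n x_p^n\right]\right\}\\
&\quad + \gamma_2\, S\left\{\frac{x_p^2}{f(x_p)}\left[\alpha_0 + \alpha_1 x_p + \ldots + \alpha_n x_p^n\right]\right\}
+ \ldots
+ \gamma_n\, S\left\{\frac{x_p^n}{f(x_p)}\left[\alpha_0 + \alpha_1 x_p + \ldots + \alpha_n x_p^n\right]\right\} = 0.
\end{aligned}$$
Remembering that $S\left\{\frac{x_p^q}{f(x_p)}\right\} = N m_q$ and applying the relations (16) this
reduces to
$$\gamma_0 + \gamma_1 x_r + \gamma_2 x_r^2 + \ldots + \gamma_n x_r^n = 0,$$
from which the $\gamma$'s are eliminated by (17).
$$\begin{vmatrix}
0 & 1 & x_r & x_r^2 & \ldots & x_r^n \\
1 & m_0 & m_1 & m_2 & \ldots & m_n \\
x_s & m_1 & m_2 & m_3 & \ldots & m_{n+1} \\
\vdots & \vdots & \vdots & \vdots & & \vdots \\
x_s^n & m_n & m_{n+1} & m_{n+2} & \ldots & m_{2n}
\end{vmatrix} = 0 \qquad (18)$$
is therefore the condition that $y_r$ and $y_s$ are uncorrelated.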
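Condition (18) can be tried on a case where the answer is known in advance. In the sketch below (f(x) = 1, hypothetical function names and places) three equal groups are fitted with a parabola, so the adjusted values at the three places are simply the group means and must be uncorrelated; the covariance computed from the multipliers and the bordered determinant of (18) both vanish accordingly.

```python
import numpy as np

def moment_matrix(places, weights, n):
    """Matrix with entries m_{i+j} for f(x) = 1."""
    x = np.asarray(places, dtype=float)
    w = np.asarray(weights, dtype=float)
    V = x[:, None] ** np.arange(n + 1)
    return V.T @ (w[:, None] * V)

def scaled_covariance(x_r, x_s, places, weights, n):
    """N * cov(y_r, y_s) / sigma^2, i.e. gamma_0 + gamma_1 x_r + ... + gamma_n x_r^n."""
    M = moment_matrix(places, weights, n)
    v_r = float(x_r) ** np.arange(n + 1)
    v_s = float(x_s) ** np.arange(n + 1)
    gamma = np.linalg.solve(M, v_s)          # the gammas of (17)
    return float(gamma @ v_r)

def determinant_18(x_r, x_s, places, weights, n):
    """Bordered determinant of (18); it vanishes for uncorrelated adjusted values."""
    M = moment_matrix(places, weights, n)
    B = np.zeros((n + 2, n + 2))
    B[0, 1:] = float(x_r) ** np.arange(n + 1)
    B[1:, 0] = float(x_s) ** np.arange(n + 1)
    B[1:, 1:] = M
    return float(np.linalg.det(B))

if __name__ == "__main__":
    places, weights = [-1.0, 0.2, 1.0], [1 / 3, 1 / 3, 1 / 3]
    print(scaled_covariance(-1.0, 0.2, places, weights, 2))   # close to 0
    print(determinant_18(-1.0, 0.2, places, weights, 2))      # close to 0
```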
(9) Returning to the formula (11) for $\sigma_y^2$ written as a sum of squares we shall
now prove that the (p + 1)st term of this put equal to zero determines a set of p abscissae
the adjusted y's of which are mutually uncorrelated both for a function of the pth and
the $(p-1)$st degree.


The condition for $y_r$ and $y_s$ corresponding to the arguments $x_r$ and $x_s$ being
uncorrelated is for a function of the $(p-1)$st degree


2 p-l 
0 1 Ly Gta cession’ Li 

1 Hs: Wry Wes aeecec Mp1 
Lo Ui) Ae HES soba My 0 
2 ae 

Ly Ma Mg Mg cnt Most 

=f } 
Tee 1 Se Nn IN aoe Nine 


and for the same distribution of observations and for a function of the pth degree 
the condition is 


"| ) 
yO) ll Ly Cesare Li 
| Ll mM my ULB © ceoanee Mp 
Puen Wie Mil. o6occ0 Men 
| 5 = OR 
CRM Tee TH | (LOR eet Mp0 
| : ; ‘ 
cee 
| edly = 20900 = 00) iat Meng. ea weenie Mion 
Putting | My My, HORNY eniSsoc My 
| 
My Ms WOES. Poa Mons 
Mog Me Ma wee eee My+9 i 1D), 
| "May “Mera Nearer enee Nine 
these conditions may be written 
ed ee 
? ‘ 
x {ay oo Dosa prises (19) 
( 
Pp T Ss 
and DE {Gj bp Dat peat Ow oeaee ss Seu eee (20), 
0 


where the sums include all combinations of powers with 7 and s lying between 0 
and (p—1), and 0 and p respectively. 


Now we have for an orthosymmetrical determinant A, 
aN 5 Ay Siam AN 3 aN Scary Ass 0 Age’. 


If therefore (19) is multiplied by D and subtracted from (20) multiplied by 


D411, p41 the coefficient of x, . 2, becomes 


Do41, 041+ Drta,sia — D . Dost, 941, 241.841 = Dy s3, 041+ Dost, st 
as long as both r and s are smaller than p. 
When one of them, for example s, equals p the term is
B® - Dysy, 41+ Dray, ot 
which is of the same form and this also holds for 7 = s = p when the term is 
Te TD Pe 
The total result is thus 


‘3 
Ur Ss 

ie . Go Do41, 741 . Dore = 0, 

) 


or in the form of determinants 


tL Wiley On Hrs, BORaoe Te Ne ie Ue Tt) = ee 0, OE ometicdhOrs pee 
Cen. Ms ey -eoeree My | | ish Up og mL eae LO eee My 
Menai IN, Niger: Osa .| Ue Ta ag re Myr = 0. 
. . . . | | : . | 

ATOMS 0 Miao «<0. Hip FI A all Ng SLO 0 SOR 7 Re Nain 
Hence $x_r$ and $x_s$ must be roots of
$$\begin{vmatrix}
1 & m_0 & m_1 & m_2 & \ldots & m_{p-1} \\
x & m_1 & m_2 & m_3 & \ldots & m_p \\
x^2 & m_2 & m_3 & m_4 & \ldots & m_{p+1} \\
\vdots & \vdots & \vdots & \vdots & & \vdots \\
x^p & m_p & m_{p+1} & m_{p+2} & \ldots & m_{2p-1}
\end{vmatrix} = 0 \qquad (21).$$

When $x_r$ is found from this and substituted in (19) or (20) we get, since the
coefficient of $x_s^p$ in the latter is zero, an equation of the $(p-1)$st degree to determine
$x_s$. It is therefore clear that any pair of roots of (21) determine a pair of
uncorrelated y's.
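As an illustration of (21), the sketch below expands the determinant along its first column to obtain a polynomial in x and finds its roots for a uniform continuous distribution on (-1, 1) with f(x) = 1, whose moments are 1/(q + 1) for even q and zero for odd q. The function names are hypothetical. For this particular distribution the roots should agree with the zeros of the Legendre polynomial of degree p, which is a standard fact and not something taken from the text; the comparison uses `numpy.polynomial.legendre.leggauss`.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

def uniform_moments(count):
    """m_0 ... m_{count-1} for a uniform continuous distribution on [-1, 1]."""
    return [1.0 / (q + 1) if q % 2 == 0 else 0.0 for q in range(count)]

def roots_of_21(m, p):
    """Roots in x of determinant (21), given moments m_0 ... m_{2p-1}."""
    block = np.array([[m[i + j] for j in range(p)] for i in range(p + 1)])
    # Cofactor expansion along the first column (1, x, ..., x^p):
    # the coefficient of x^i is (-1)^i times the minor without row i.
    coeffs = [(-1) ** i * np.linalg.det(np.delete(block, i, axis=0))
              for i in range(p + 1)]
    return np.sort(np.roots(coeffs[::-1]).real)

def scaled_covariance(x_r, x_s, m, n):
    """N * cov(y_r, y_s) / sigma^2 for a fit of degree n with moments m."""
    M = np.array([[m[i + j] for j in range(n + 1)] for i in range(n + 1)])
    v_r = float(x_r) ** np.arange(n + 1)
    v_s = float(x_s) ** np.arange(n + 1)
    return float(v_r @ np.linalg.solve(M, v_s))

if __name__ == "__main__":
    p = 3
    m = uniform_moments(2 * p + 1)
    r = roots_of_21(m, p)
    print(r, leggauss(p)[0])
    # Uncorrelated both for degree p - 1 and for degree p:
    print(scaled_covariance(r[0], r[2], m, p - 1), scaled_covariance(r[0], r[2], m, p))
```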


II. The “best” grouping of observations with constant standard deviation. 

(1) It was shown in the last section under (7) that the mean of the squared
standard deviations of the adjusted y taken over the places of observation and weighted
with the number of observations at each place is equal to $\frac{\sigma^2}{N}(n+1)$ and that therefore
the curve of squared standard deviation can never be entirely below that value. And
further, that since (n + 1) equally big groups of observations at the places of
observations give the squared standard deviation this minimum, there is the
possibility, $_n\sigma_y^2$ being of the 2nth degree in x, that by placing the groups at special
positions the curve of squared standard deviation could have those values $\frac{\sigma^2}{N}(n+1)$
as its maxima within the range of observations.

Let $x_1, x_2, \ldots, x_p, \ldots, x_{n+1}$ be the places of observations and $y_p$ the mean of the
observations at $x_p$, the interpolation formula of Lagrange is then
$$y = S\left\{\frac{(x-x_1)\ldots(x-x_{p-1})(x-x_{p+1})\ldots(x-x_{n+1})}{(x_p-x_1)\ldots(x_p-x_{p-1})(x_p-x_{p+1})\ldots(x_p-x_{n+1})}\;y_p\right\},$$
the sum taken over all the places of observation.

From this we find
$$\sigma_y^2 = (n+1)\,\frac{\sigma^2}{N}\,S\left\{\frac{(x-x_1)^2\ldots(x-x_{p-1})^2(x-x_{p+1})^2\ldots(x-x_{n+1})^2}{(x_p-x_1)^2\ldots(x_p-x_{p-1})^2(x_p-x_{p+1})^2\ldots(x_p-x_{n+1})^2}\right\} \qquad (22),$$
which for $x = x_1, x_2, \ldots, x_{n+1}$ equals $\frac{\sigma^2}{N}(n+1)$, the n terms of the sum being zero
and the (n + 1)st taking the value 1 as it ought to. If $x_q$ be the greatest of the
x's it is hence clear that for $x > x_q$, since
$$\frac{(x-x_1)^2\ldots(x-x_{q-1})^2(x-x_{q+1})^2\ldots(x-x_{n+1})^2}{(x_q-x_1)^2\ldots(x_q-x_{q-1})^2(x_q-x_{q+1})^2\ldots(x_q-x_{n+1})^2} > 1,$$
$$\sigma_y^2 > \frac{\sigma^2}{N}(n+1).$$

The same applies to any x smaller than the smallest of the places of observation.
Therefore as we want $\sigma_y^2$ to be $\le \frac{\sigma^2}{N}(n+1)$ at the ends of the range we have to place
two of our groups of observations there.


Let us take the half of the range within which it is possible to make observations as
the unit of x so that the range goes from $-1$ to $1$.


(2) Hence for a linear function there is no choice left, the two groups of observations
must be at $-1$ and $1$.


According to (22) we have
$$_1\sigma_y^2 = \frac{\sigma^2}{N}\,2\left\{\frac{(x+1)^2}{4} + \frac{(x-1)^2}{4}\right\},$$
or
$$_1\sigma_y^2 = \frac{\sigma^2}{N}\,2\left\{1 - \tfrac{1}{2}\left(1 - x^2\right)\right\},$$
which illustrates the well-known fact that by simple interpolation between two
equally good values of a table, we obtain interpolated values with less probable error
than those of the table.


(3) Investigating a function of the second degree we have a third group to
place besides the two at $-1$ and $1$, that is if we do not beforehand suppose the
distribution to be symmetrical. Let the third group be at $\alpha$, then the interpolation
gives
$$y = \frac{(x-1)(x-\alpha)}{2(1+\alpha)}\,y_{-1} + \frac{(x+1)(x-\alpha)}{2(1-\alpha)}\,y_{1} + \frac{x^2-1}{\alpha^2-1}\,y_{\alpha},$$
from which
$$_2\sigma_y^2 = \frac{\sigma^2}{N}\,3\left\{\left[\frac{(x-1)(x-\alpha)}{2(1+\alpha)}\right]^2 + \left[\frac{(x+1)(x-\alpha)}{2(1-\alpha)}\right]^2 + \left[\frac{x^2-1}{\alpha^2-1}\right]^2\right\}.$$

We want this to be a maximum for $x = \alpha$, but $\left(\frac{d\sigma_y^2}{dx}\right)_{x=\alpha}$ can only vanish for
$\alpha = 0$, in which case $_2\sigma_y^2$ is reduced to
$$_2\sigma_y^2 = \frac{\sigma^2}{N}\,3\left\{1 - \tfrac{3}{2}\,x^2\left(1 - x^2\right)\right\},$$
which shows that we have succeeded in making $\sigma_y$ a maximum at $x = 0$ and
obtained a squared standard deviation with the maximum value $\frac{\sigma^2}{N}\,3$, as we desired.
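Formula (22) lends itself to a direct numerical check of the two reduced expressions found so far. The sketch below is a minimal illustration with hypothetical function names; it evaluates $N\sigma_y^2/\sigma^2$ from the squared Lagrange terms for equal groups and compares it with $2\{1 - \tfrac12(1-x^2)\}$ for groups at -1, 1 and with $3\{1 - \tfrac32 x^2(1-x^2)\}$ for groups at -1, 0, 1.

```python
import numpy as np

def variance_22(x, places):
    """N * sigma_y^2 / sigma^2 from (22): (n + 1) times the sum of squared Lagrange terms."""
    places = np.asarray(places, dtype=float)
    total = 0.0
    for p, x_p in enumerate(places):
        others = np.delete(places, p)        # all places except x_p
        total += (np.prod(x - others) / np.prod(x_p - others)) ** 2
    return len(places) * total

def linear_closed_form(x):
    return 2 * (1 - 0.5 * (1 - x ** 2))

def quadratic_closed_form(x):
    return 3 * (1 - 1.5 * x ** 2 * (1 - x ** 2))

if __name__ == "__main__":
    for x in np.linspace(-1, 1, 5):
        print(x, variance_22(x, [-1, 1]), linear_closed_form(x),
              variance_22(x, [-1, 0, 1]), quadratic_closed_form(x))
```

Inside the range both curves stay at or below n + 1, the value they take at the places of observation.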
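(4) For a function of the third degree we find from four groups of observations
at $-1$, $1$, $\alpha$ and $\gamma$ that
$$\begin{aligned}
y = {} & -\frac{(x-1)(x-\alpha)(x-\gamma)}{2(1+\alpha)(1+\gamma)}\,y_{-1} + \frac{(x+1)(x-\alpha)(x-\gamma)}{2(1-\alpha)(1-\gamma)}\,y_{1}\\
& + \frac{(x^2-1)(x-\gamma)}{(\alpha^2-1)(\alpha-\gamma)}\,y_{\alpha} + \frac{(x^2-1)(x-\alpha)}{(\gamma^2-1)(\gamma-\alpha)}\,y_{\gamma},
\end{aligned}$$
and
$$\begin{aligned}
_3\sigma_y^2 = \frac{\sigma^2}{N}\,4\Bigg\{ & \left[\frac{(x-1)(x-\alpha)(x-\gamma)}{2(1+\alpha)(1+\gamma)}\right]^2 + \left[\frac{(x+1)(x-\alpha)(x-\gamma)}{2(1-\alpha)(1-\gamma)}\right]^2\\
& + \left[\frac{(x^2-1)(x-\gamma)}{(\alpha^2-1)(\alpha-\gamma)}\right]^2 + \left[\frac{(x^2-1)(x-\alpha)}{(\gamma^2-1)(\gamma-\alpha)}\right]^2\Bigg\}.
\end{aligned}$$
The condition $\left(\frac{d\sigma_y^2}{dx}\right)_{x=\alpha} = 0$
requires $3\alpha^2 - 2\alpha\gamma - 1 = 0$,
and $\left(\frac{d\sigma_y^2}{dx}\right)_{x=\gamma} = 0$
requires $3\gamma^2 - 2\alpha\gamma - 1 = 0$,
from which is got $\alpha^2 = \gamma^2$,
and, since $\alpha \ne \gamma$, $\alpha = -\gamma = \sqrt{\tfrac{1}{5}}$.


By introducing this value for $\alpha^2$ and $\gamma^2$ in $\sigma_y^2$ we find
$$_3\sigma_y^2 = \frac{\sigma^2}{N}\,4\left\{1 - \frac{3\cdot 5^2}{2^4}\left(x^2 - \tfrac{1}{5}\right)^2\left(1 - x^2\right)\right\},$$
which has the required maxima at $\pm\sqrt{\tfrac{1}{5}}$.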


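(5) For the functions of higher degree we shall at once assume that the distributions
sought are symmetrical, since it is pretty clear from the symmetry of
y and $\sigma_y$ with regard to the sought positions that it must be so.

To determine a function of the fourth degree let us put groups of observations
at $\pm 1$, $\pm\alpha$ and 0. The expression for $\sigma_y^2$ can be written down at once and is such
that the terms arising from the groups at $+1$ and $-1$ can be put together as well
as the terms from $+\alpha$ and $-\alpha$, then
$$_4\sigma_y^2 = \frac{\sigma^2}{N}\,5\left\{\left[\frac{(x^2-1)(x^2-\alpha^2)}{\alpha^2}\right]^2 + \frac{1}{2}\left[\frac{x(x^2-\alpha^2)}{1-\alpha^2}\right]^2\left(x^2+1\right) + \frac{1}{2}\left[\frac{x(x^2-1)}{\alpha^2(1-\alpha^2)}\right]^2\left(x^2+\alpha^2\right)\right\}.$$
$\left(\frac{d\sigma_y^2}{dx}\right)_{x=\alpha} = 0$ provides the condition $7\alpha^2 - 3 = 0$ or
$$\alpha^2 = \tfrac{3}{7},$$
with which value the squared standard deviation becomes
$$_4\sigma_y^2 = \frac{\sigma^2}{N}\,5\left\{1 - \frac{5\cdot 7^2}{2^4}\,x^2\left(x^2 - \tfrac{3}{7}\right)^2\left(1 - x^2\right)\right\},$$
which has the required characteristics.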
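(6) Adjusting by a function of the fifth degree six equally big groups of observations
at the arguments $\pm 1$, $\pm\alpha$ and $\pm\gamma$ the squared standard deviation of the
adjusted y is
$$\begin{aligned}
_5\sigma_y^2 = \frac{\sigma^2}{N}\,6\Bigg\{ & \frac{1}{2}\left[\frac{(x^2-\alpha^2)(x^2-\gamma^2)}{(1-\alpha^2)(1-\gamma^2)}\right]^2\left(x^2+1\right) + \frac{1}{2}\left[\frac{(x^2-1)(x^2-\gamma^2)}{\alpha(\alpha^2-1)(\alpha^2-\gamma^2)}\right]^2\left(x^2+\alpha^2\right)\\
& + \frac{1}{2}\left[\frac{(x^2-1)(x^2-\alpha^2)}{\gamma(\gamma^2-1)(\gamma^2-\alpha^2)}\right]^2\left(x^2+\gamma^2\right)\Bigg\}.
\end{aligned}$$

The condition for maximum at $x = \pm\alpha$ is
$$9\alpha^4 - 5\alpha^2\gamma^2 - 5\alpha^2 + \gamma^2 = 0,$$
which together with the condition for maximum at $x = \pm\gamma$
$$9\gamma^4 - 5\alpha^2\gamma^2 - 5\gamma^2 + \alpha^2 = 0,$$
since $\alpha^2$ must be $\ne \gamma^2$ results in
$$\alpha^2 + \gamma^2 = \tfrac{2}{3} \quad\text{and}\quad \alpha^2\gamma^2 = \tfrac{1}{21},$$
or
$$\alpha^2,\ \gamma^2 = \frac{7 \pm 2\sqrt{7}}{21}.$$

When these values are substituted in the expression above for $\sigma_y^2$ this may by
somewhat lengthy algebraic operations be brought into the form
$$_5\sigma_y^2 = \frac{\sigma^2}{N}\,6\left\{1 - \frac{3^3\cdot 5\cdot 7^2}{2^7}\left(x^2-\alpha^2\right)^2\left(x^2-\gamma^2\right)^2\left(1-x^2\right)\right\}.$$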
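(7) For a function of the sixth degree the observations may be supposed to
be at $\pm 1$, $\pm\alpha$, $\pm\gamma$ and 0.
The expression for the squared standard deviation of an adjusted y becomes
$$\begin{aligned}
_6\sigma_y^2 = \frac{\sigma^2}{N}\,7\Bigg\{ & \left[\frac{(x^2-\alpha^2)(x^2-\gamma^2)(x^2-1)}{\alpha^2\gamma^2}\right]^2 + \frac{1}{2}\left[\frac{x(x^2-\gamma^2)(x^2-1)}{\alpha^2(\alpha^2-\gamma^2)(\alpha^2-1)}\right]^2\left(x^2+\alpha^2\right)\\
& + \frac{1}{2}\left[\frac{x(x^2-\alpha^2)(x^2-1)}{\gamma^2(\gamma^2-\alpha^2)(\gamma^2-1)}\right]^2\left(x^2+\gamma^2\right) + \frac{1}{2}\left[\frac{x(x^2-\alpha^2)(x^2-\gamma^2)}{(1-\alpha^2)(1-\gamma^2)}\right]^2\left(x^2+1\right)\Bigg\}.
\end{aligned}$$
A maximum at $x = \pm\alpha$ requires
$$11\alpha^4 - 7\alpha^2\gamma^2 - 7\alpha^2 + 3\gamma^2 = 0,$$
and a maximum at $x = \pm\gamma$ requires
$$11\gamma^4 - 7\alpha^2\gamma^2 - 7\gamma^2 + 3\alpha^2 = 0,$$
which added and subtracted provide
$$11\left(\alpha^2+\gamma^2\right)^2 - 36\alpha^2\gamma^2 - 4\left(\alpha^2+\gamma^2\right) = 0$$
and
$$\left(\alpha^2-\gamma^2\right)\left\{11\left(\alpha^2+\gamma^2\right) - 10\right\} = 0.$$

Since we must have $\alpha^2 \ne \gamma^2$,
$$\alpha^2 + \gamma^2 = \tfrac{10}{11} \quad\text{and}\quad \alpha^2\gamma^2 = \tfrac{5}{33},$$
or
$$\alpha^2,\ \gamma^2 = \frac{15 \pm 2\sqrt{15}}{33}.$$
The expression for $\sigma_y^2$ may after rather laborious operations be brought into the
form
$$_6\sigma_y^2 = \frac{\sigma^2}{N}\,7\left\{1 - \frac{3^3\cdot 7\cdot 11^2}{2^7}\,x^2\left(x^2-\alpha^2\right)^2\left(x^2-\gamma^2\right)^2\left(1-x^2\right)\right\}.$$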
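The reduced expressions for the third to sixth degree can be compared numerically with the general formula (22). The sketch below is an illustration under stated assumptions: it uses the values of $\alpha^2$ and $\gamma^2$ found above, hypothetical function names, and numerical factors 75/16, 245/16, 6615/128 and 22869/128 in front of the product, the last of which is the editor's own evaluation rather than a figure quoted from the text.

```python
import numpy as np

# Squares of the interior places for each degree (the ends -1 and 1 are always used).
INTERIOR_SQUARES = {
    3: [1 / 5],
    4: [3 / 7, 0.0],
    5: [(7 + 2 * np.sqrt(7)) / 21, (7 - 2 * np.sqrt(7)) / 21],
    6: [(15 + 2 * np.sqrt(15)) / 33, (15 - 2 * np.sqrt(15)) / 33, 0.0],
}
# Numerical factor of the product in the reduced form (assumed values, see lead-in).
COEFF = {3: 75 / 16, 4: 245 / 16, 5: 6615 / 128, 6: 22869 / 128}

def best_places(n):
    """Places of the n + 1 equal groups for degree n, sorted."""
    pts = {-1.0, 1.0}
    for s in INTERIOR_SQUARES[n]:
        r = float(np.sqrt(s))
        pts.update({r, -r})                  # 0.0 and -0.0 collapse to one place
    return sorted(pts)

def variance_22(x, places):
    """N * sigma_y^2 / sigma^2 from (22)."""
    places = np.asarray(places, dtype=float)
    total = 0.0
    for p, x_p in enumerate(places):
        others = np.delete(places, p)
        total += (np.prod(x - others) / np.prod(x_p - others)) ** 2
    return len(places) * total

def reduced_form(x, n):
    """(n + 1) * {1 - c * (1 - x^2) * product of squared interior factors}."""
    product = 1 - x ** 2
    for s in INTERIOR_SQUARES[n]:
        product *= x ** 2 if s == 0 else (x ** 2 - s) ** 2
    return (n + 1) * (1 - COEFF[n] * product)

if __name__ == "__main__":
    for n in (3, 4, 5, 6):
        x = 0.37
        print(n, variance_22(x, best_places(n)), reduced_form(x, n))
```

Because the subtracted product is never negative inside the range, the reduced form makes it visible at a glance that the curve cannot exceed n + 1 there.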
(8) It is thus, as we aimed at, shown for functions up to the sixth degree that
by distributing the observations in (n + 1) equally big groups and choosing the places
of these groups in one special way we can manage to keep the standard deviation of any
adjusted y within the possible range of observations less than the standard deviation at
the places of observation. There is every reason to believe that the rule holds for
any degree of function, but as the general proof would be very complicated and as
almost all practical cases will be covered by functions up to the sixth degree, the
problem can be left at this stage.

As we have proved, any other distribution of observations leads to a curve of
squared standard deviation that has a higher maximum value within the range. This
special set of (n + 1) groups has therefore a very conspicuous advantage over all
other distributions of observations. The application of it is however limited in that
it demands that the degree of the function must be known beforehand and thus the
observations do not provide any justification for the form of function chosen. If however the
function has been fully investigated beforehand and there is no doubt about its form,
(n + 1) equally big groups of observations placed as indicated are the most desirable
set of observations possible. The approximate values of the places of the groups
are given in the table below.


TABLE I.
Places of observation (each value to be taken with both signs).

Degree of function    1st      2nd      3rd      4th      5th      6th
                      1.0000   1.0000   1.0000   1.0000   1.0000   1.0000
                      -        0.0000   0.4472   0.6547   0.7651   0.8302
                      -        -        -        0.0000   0.2852   0.4689
                      -        -        -        -        -        0.0000
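The tabulated places can be reproduced numerically. The sketch below rests on an identification that is not made in the text and is offered only as an assumption that appears to fit the table: the places coincide with the zeros of $(1 - x^2)\,P_n'(x)$, where $P_n$ is the Legendre polynomial of degree n. It uses `numpy.polynomial.Legendre`; the function name is hypothetical.

```python
import numpy as np
from numpy.polynomial import Legendre

def places_for_degree(n):
    """Zeros of (1 - x^2) * P_n'(x): the two ends and the zeros of P_n'."""
    interior = Legendre.basis(n).deriv().roots()   # empty for n = 1
    return np.sort(np.concatenate(([-1.0], np.real(interior), [1.0])))

if __name__ == "__main__":
    for n in range(1, 7):
        # Print the non-negative places to four decimals for comparison with Table I.
        pts = places_for_degree(n)
        print(n, np.round(pts[pts > -1e-12], 4))
```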


With rougher approximation the intervals between the observations, still 
expressed by the half range as unit, are as follows: 


Ist degree of function 2 

2nd - es leceatl! 

3rd - ee eae 

Athy » eo ki 

bth » eet iad Se hee 
6th, » Bog a ie ae 


The six curves of standard deviation are represented in Diagram 1. It will be 
seen that the minima of a curve, if it has more than two, are the lower the 
greater their distances from the middle of the range, so that the variation of the 
standard deviation is greatest in the outermost intervals of the range. 


III. Uniform continuous distribution of observations with constant standard 
deviation. General formulae. 


(1) As was pointed out in the last section the lumping up of observations in
groups just necessary to determine the constants of the function in question has
some drawbacks and cannot be recommended as a universal rule. In many cases
it is through the observations themselves that we first get to know the form of the
function, and thus a full investigation may require more groups of observations
than merely a number equal to the assumed number of constants in the formula.
Besides, even when we believe we know beforehand, on theoretical or other grounds,
the nature of the function we may consider it prudent to distribute
the observations so that they supply us with data whereby we may control our
hypothesis that the assumed function is the right one.

It is therefore desirable to find other forms of distributions which, at the same 
time as they make the standard deviation of the adjusted function vary little 
inside the range of observations, are more uniformly spread over this range. 


(2) A uniform continuous distribution at once recommends itself as the simplest 
assumption. As we suppose the observations to have constant standard deviations 
the elements of the determinants of (15) are the moment coefficients of the x's at
the places of observation.


When the N observations are uniformly spread between $x = -1$ and $x = 1$,
$$\mu_{2q} = \frac{1}{2q+1} \quad\text{and}\quad \mu_{2q+1} = 0,$$
and the expression for $_n\sigma_y^2$ is, according to (15),
ees ale 1 py | 
ee el. 
vee N bs 1 pe eres | 

ig oemmia Ma Be 

| 1 Be [pte Oo Or M2n-2 

| OS Saat ay aoe bop 

4. g2 PP Hey Mapig Hap—4 

Me (Poe Sona: Map—2 Me [a dooce Pap 


Map-2 Map veers: Map-6 | | Man Monte +++: Hap-2 


1 1 [lg Manes ds P2op-2 | 
UP fg ag veeees Lop 
se EP? Hen Mapig s+ Mayo | ae (23) 
| 1 [eapiavncse Popo 1 Tp peer Lop j 
| [Ppp [Hel ocaor Pap | be fea eiereciecie Men+e 
: ; | : 
| Man-2 Map --++++ Map—4 | ope ep -Hore ssc Map 
By this formula we may evaluate successively $_1\sigma_y^2$, $_2\sigma_y^2$, ... when we know
the two general terms of which the sum consists.
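Before the determinants are evaluated in closed form, the behaviour of the uniform continuous distribution can be seen numerically. The sketch below assumes f(x) = 1 and the moments $\mu_{2q} = 1/(2q+1)$, $\mu_{2q+1} = 0$ stated above, and obtains $N\sigma_y^2/\sigma^2$ by solving the moment system directly instead of using (23); the function names are hypothetical.

```python
import numpy as np

def uniform_moment(q):
    """Moment coefficient of order q for a uniform spread over (-1, 1)."""
    return 1.0 / (q + 1) if q % 2 == 0 else 0.0

def uniform_scaled_variance(x, n):
    """N * sigma_y^2 / sigma^2 at x for degree n, uniform continuous distribution."""
    M = np.array([[uniform_moment(i + j) for j in range(n + 1)]
                  for i in range(n + 1)])
    v = float(x) ** np.arange(n + 1)
    return float(v @ np.linalg.solve(M, v))

if __name__ == "__main__":
    for n in range(1, 5):
        centre = uniform_scaled_variance(0.0, n)
        end = uniform_scaled_variance(1.0, n)
        print(n, centre, end, n + 1)   # the end value lies well above n + 1
```

In this sketch the value at the ends of the range comes out as $(n+1)^2$, to be compared with the maximum n + 1 of the grouped arrangement; this illustrates the increase of the standard deviation towards the ends of the range.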
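(3) The determinant of the order p,
$$_p\overset{q}{\Delta} = \begin{vmatrix}
\frac{1}{2q-1} & \frac{1}{2q+1} & \ldots & \frac{1}{2q+2p-3} \\
\frac{1}{2q+1} & \frac{1}{2q+3} & \ldots & \frac{1}{2q+2p-1} \\
\vdots & \vdots & & \vdots \\
\frac{1}{2q+2p-3} & \frac{1}{2q+2p-1} & \ldots & \frac{1}{2q+4p-5}
\end{vmatrix},$$
which includes the two types of the denominators in (23), shall first be evaluated.
We find
$$_1\overset{q}{\Delta} = \frac{1}{2q-1} \quad\text{and}\quad {}_2\overset{q}{\Delta} = \frac{2^2}{(2q-1)(2q+1)^2(2q+3)},$$
and it shall be proved that if
$$_p\overset{q}{\Delta} = \left\{1^{p-1}\,2^{p-2}\ldots(p-2)^2\,(p-1)\right\}^2\,2^{p(p-1)}\;{}_p\overset{q}{\Pi} \qquad (24)$$
up to the order p, $_p\overset{q}{\Pi}$ being the product of the elements of $_p\overset{q}{\Delta}$, the rule holds
for determinants of any order.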
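Formula (24) can be tested in exact arithmetic for small orders. The sketch below, with hypothetical function names, builds the determinant from its elements $1/(2q + 2(i+j) - 1)$, i, j = 0 ... p - 1, evaluates it by fraction-exact elimination, and compares it with the right-hand side of (24).

```python
from fractions import Fraction
from math import prod

def element(i, j, q):
    """Element in row i, column j (counted from 0) of the determinant of order p."""
    return Fraction(1, 2 * q + 2 * (i + j) - 1)

def delta(p, q):
    """Exact value of the determinant by Gaussian elimination without pivoting."""
    a = [[element(i, j, q) for j in range(p)] for i in range(p)]
    det = Fraction(1)
    for c in range(p):
        pivot = a[c][c]          # nonzero for q >= 1 in the cases tried here
        det *= pivot
        for r in range(c + 1, p):
            factor = a[r][c] / pivot
            for k in range(c, p):
                a[r][k] -= factor * a[c][k]
    return det

def formula_24(p, q):
    """Right-hand side of (24): numerical factor times the product of all elements."""
    numeric = prod(k ** (p - k) for k in range(1, p)) ** 2 * 2 ** (p * (p - 1))
    elements = [element(i, j, q) for i in range(p) for j in range(p)]
    return numeric * prod(elements)

if __name__ == "__main__":
    for q in (1, 2):
        for p in (1, 2, 3, 4):
            print(p, q, delta(p, q), formula_24(p, q))
```

Exact fractions are used because determinants of this type become very small as p grows, and floating-point evaluation would obscure the comparison.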


It is clear that 
q+2 qd 


qd Opal 
+1441 = BSS 


q qa 

— i 

p+iA 41, p41 im oA, p+1At1, p Lz ae pA 
q q+? 

and pio, ptt11 = ot. 
If we therefore in the general relation for an orthosymmetrical determinant 

2 
Ney : Agy ae Ase 

sss's’ 


q 
put s=1 and s’=p+1 and A=—,,, A, we find 


A= 


q a+2 att 
in pA. >A— pA 
p+i q+2 ? 

p-1 


and, using (24), 


Co naa" 30 ate eee (er ae yeaa alll lil 


p+14 — {1-2 \7=3 ae (p — 3)? (p — 2) “i D 
-1 
Now, according to the definition of IT, ? 
qd q+2 
: eee ss aa (29 =) qa le (29 EO) ooneeS (2q all. 4p — 3)2 (29 aL 4p =) 
II Il , 
p+1 p+1 x (2q afk 2 aus 1)2, 
nae 
2 = (2g — 1)? (2q + 1)? (2q + 3)... (2g + 4p — 3)? (2g + 4p — 1)? 
pt ll? 
and 
q+2 
pall 


= (2g —1) (29 + 1)? (29 + 3)2...... (29 + 4p — 3)2(2¢ + 4p 1), 
Hence
qd qd 
pal = (1? 2772... (p — 2)3 (p — YR. QOV + TL. [(2q + 2p — 12 
. [527 — 1) Got Ap — 1, 
qd 
wal = (17.2772... (p— 12. pe. 22th. TT, 


which agrees with (24). 
q 
(4) Next we have to evaluate the minors of $_p\overset{q}{\Delta}$ necessary for calculating the
numerators in (23). For this purpose we only need the minors $_p\overset{q}{\Delta}_{s,1}$, but to carry
through the proof by induction $_p\overset{q}{\Delta}_{s,r}$ for any values of s and r is needed.


For ie 3 we directly find, 
q 


a ey, 
s23= Og — 1) Qq + 1) (29 +3) 2g +5) 
4 22, 2? 
and sAo,2 


(2g — 1) (2¢ + 3)? (2g + 7)’ 
these both agree with the following formula which will be proved by induction, 


q q 
ots p= (SH Le se ea) oye eh 0 an | are (9 —3)*(p.— 2)}7..2'2 VO es 
OR snot (25) 
er (posh a 
By-1,s-1 18 the binomial coefficient s—1 Eee and ,II,,, the product of all the 


elements of BAe. 

The relation has to be proved first for 7 = s, then for r = p and finally for any 
combination s and r. 

For the first two proofs we use the relation between the minors of an ortho- 


symmetrical determinant 


(NGS Aegis! . Ages" 5" = Noe S. (26) 
=. — arn Seema re ee ee a . 
Ags" NS Sasosh Nene aig Ay" 535! : Nard 


This is found from two relations given by Professor Pearson* by dividing one 
of them by the other. 


q 
(oye thet A be 5, A, s’ = 1 and s’’ = » +1, then 


q qd q qd 
2 
pitAss = v+tAss11 : pris, 8,p+1, 9+1 Sori AG ease ee (27) 
qa 7] a qd qa wool ad . 
+1 A, pt+1 +14), 1, 9+1, 9+1° pgs $51, 94-1 on pe Sh p+1,s,1° pit sa 1,5, p+1 
a q+2 
Now Pea Soucy pyle reer ’ 
qd q 
pas, 8, D+1, p+1 — pANGs , 
q 
—— 1 
Dei AG 1 o41, pe = ( Laue aE p? 


a q+1 
— D 
aye igo = (= iy paws 


* Biometrika, Vol. XI, pp. 232-3.
1


p,S? 


Co] a a+ 
Sl p 
pit Ani, o4ie4 oa ( 1) pA 


qa q+1 
pe eve ot = (— 1) Ps a Nea 


so that all the determinants on the right side in (27) can be evaluated by (25). 
They all have the factor 


{Ul age yo Bee ae (p — 3)? (p — 2)}2 . 2(P-2) (2) 
in common, when that is divided out there remains 
q ; z q+2 q atl 
ptt Ass (= 1)PB i, 5-2 - Boa, s-1 (ost, 5-4 + ples = pLTs—1, 5) 


a = q+1 q+1 = q+1 a q+1 = sa0c00d60 
ptl Aion ferrari 7 fel aa (ues C Ally Te Di h0) 


Now indicating by C, the product of the elements of the rth column or rth 


q q 
row in ,,,A and by e,, the element of the ,,,A common for the rth row and sth 


column we find iy a OF 
p+ 1ttss = ptts—1,s—1° Or 
é14 . C4 
q qa 2 
ott pt+l1 
Pallet a pul. ot sae OT car po) 
Cp+1, p+1 + &p+i,s 
1 
at CiC a4 


qa 
mails aa Bll vecns . : 
1, p41 -&s,1+ 6s, p41 


Hence the factor of the numerator in (28) is reduced to 


a] 2 2 
Tp, 12 Ss 944 
p+1 


9 
ss C2 12 {eu p41, p+ — 41, pair 
Uist 


For the IT’s of the denominator we find 


q a4 CAG: 
per lla oar = p**s—1,1 - Prag , 
€51 +s, n41+ 1,1 
" Ga Coy - Os 
pt+itti,pt+i ™ p DS? G e en? 
‘s, D+1 ° “+1, p+1° “1,5 


q q+1 C OC 
pt+1'V1 
p+1 I, DLS pr D1 a = 


h 


1, m1 > P11: en4i, p41 

1 2 

é as a+ C? 

p+1 1, 9+1 > 9 s—1,s° ’ 
C55 es1 G es, p+1 


the factor containing II’s of the denominator of (28) is therefore equal to 


7 Cra e é 
2 1s°* ‘%p4+i,s* %1,1 ° p+), 94+1 
p+1 IT, p+1 ; C,.0 C2 {ers + €n41,s — 1, p41: Css} 
ab evoapal OA 
Introducing these two expressions in (28) and substituting for the one factor 
q 
II OF als e 
: ae “_ the value —1—#+1 ,_—**_ we hence find 
II 8 1, p41 
pt+1**1, p+1 1 1 
q Bh cares bs oe q 
p+1 Ns a. (—1)"B B ei, p+ C11 - Cn 41, p+1 p+l1 ts 
qd an p—l, s—2/* p—1, s—1 ¥ 
‘A 1 1 q 
p+1“1, p+1 a p+1 I, pt+l 


Css-€1, p41 1, +P nti, s ; 
The fraction containing e's equals


(2g + 2p—1P—(2g—MWQg+4p— ly) (Se 
(2q-+ 4s — 5) (29 + 2p — 1) — (2g + 2s — 3) (2g + 28+ 2p—3)  (s—1)(p—s +1)’ 
g qa 
hence gui Bss (See osillss 
p+1 A,, p+1 relly p+1 


As 


qa q+1 _ qd 
pasa (— )?,A —(—1)?{1P>. PUT ee (D228 (pee a Nope 
we therefore find 
q q 
p+ Ags = Bp, 5-1 {19 . 2? (=e) (p Aye oP ee alles, 
agreeing with (25). 


q q 
(6) To evaluate ,,,A, 51, we shall in (26) put A=,,,A,s=1, s’=s and 
s’’=p-+1. Reversing the fractions we then get 


qd qd qd q 
pt+1 A, p+1 _ pt+l A, 8, D+1, D+1* p+1 Ai, 1, 8, p+] si p+l Apes, p+1,1, 8° +1 A, s1, +1 
A A A ee 
pri “11 pt+1 1,1, 8,8 *° +1 "1,1, 9+1, 9+1 p+] 1,1, 8, p+1 
Pe (29) 

q qa 

As pt1 A, 8, D+1, D+1 — pees 
qd qt+2 qg+1 
p+1 Ai, 1,8, p+1 — rien, AS es Ay pn 
qd q+1 
pt1 Apu, pt+1,1,s — (ee ae ps, p>? 
q qt+1 
p+1 A,, 3,1, p+1 — (= ib) PNT, 8? 
q q+2 
p+1 Ay, MSS) aa BAe s-1? 
qa q 
p+1 Ai, 1, p+1, p+1 — DP Ay, 1> 
the right side of (29) can be evaluated by (25). 
We thus get 
q q+1 q+1 q q+2 
ots pia (= 1)? Boa s-1 + By-a,s—u + Pot. s-1 (Tolls sa - ols, » + pHs » plea, ») 
q cs q+2 a q+) 
ot Ani B;, 1, s—2 (pulls Ls—1<" polly cs ple ot) 
MM ASR Le ie Te ee Solis al oa (30). 


qd 
We want here to express the $\Pi$'s of the numerator by ${}^{q}_{p+1}\Pi_{s,p+1}$ and those of
the denominator by ${}^{q}_{p+1}\Pi_{1,1}$; and we find the following relations


ti an Chae: 
ptitts, p41 — ptts,s-1*) ’ 
€1s-©1, n+1- Css 
1 Y 
tess: Cugstien 
ia iY apt at Dane 2 ie 
1, p+1° “+1, 9+1°%1,8 
1, 
ii ~ i Ope Oras 
p+l1 S370 toll Discs Si aa 


7 ? 
Cs, pti + Css + Op+i, p+ 
2
Re CAC, 
DELS Ss Pl oe De Sipe e e ? 
€1,1- 1s + 1, p41 
| qd q+2 C; f CG, 
anc — ess Bags 
amb aill OSes — il 2? 
Css . Crs 
q qd 2 
"| = pt+l1 
pt+1 Thy <a olla, : 2 ? 
Co+4, vt1 + &1, p+1 
q 1 
i pa TT Cr Cys = 
(rt oa tl ee Woe 1 sie 


es, p+1 + &1, n+1 + M1s 
Substituting the II’s found from these relations into (30) and eliminating the 
q 


alll 
one factor 7**—>#"* by 


oa lly ‘ 
pills pia Cs, pat 
; : 
‘ il Cx Crem 
p+ittyy 
we get - es yee! Drei 2 ; fF 
pris, p+1 (i ee peice €1,5-€1, pin 1,1 + &s, vt rrenllil p+1 
= = ———————— ee . > 
a Braisee 1 1 I 
Url et Bk 2 5 aie p+1**1,1 
: f ' spt+1  &ss+@ pti, p+ 
or introducing the values of the e’s 
qa 
nts, p41 
q 
ott A1,1 : 
4 (— ID) pean yee ie (29+ 2s — 3) (29¢-+2p— 1)—(2¢—1) q+ 2p +28 — 3)- spl Gru 
ois (24-+ 2p -+ 2s —3)2— (2q-+ 48 —5) (2q-+ 4p — 1) a 
p+1**1,1 
I 
pt+1 S, p+1 
= (al) Pee ae < 
p+ iH, 1 
Now 
A oe 1 9p-2 2 2 1) ' 
pit Ar1= pA={1?*. 272... (p12)? (pW) ee ten ells 
and hence 
q q 
— cs -1 
ane pdt Ce DER Oe dl oe en eee (p — 2)? (p — 1)}?. 279-9 | Ws par 


in agreement with (25). 


(7) It now remains to prove that (25) holds for ${}^{q}_{p+1}\Delta_{rs}$ when both $s$ and $r$
are different from $1$ and $p+1$, and $r$ different from $s$.

For this shall be used the relation 

ING ING sate? = ON gee Nea a Nee Ne 

between an orthosymmetrical determinant and its minors. 

Putting $\Delta = {}^{q}_{p+1}\Delta$, $s = p+1$, $s' = r$ and $s'' = s$ and solving the equation with
regard to ${}^{q}_{p+1}\Delta_{rs}$ we have

1 


p+ Ay, 1, p+1 


where prion, p+1,7,s — pAr, s 


mee a (ny1A y pti, p+1,7,s 3 PNET (pO ana saity a; 


bo 
Or 
Evaluating this by (24) and (25) we get


q 
p+1 INS Shwe 


ey i 50) cae (p — 2)? (p — 1)}2. 27?» 
q 
pul 
q a q q 
x [478 p41, Ss Bip, r—1° pall piles a5 Bo. Sige. Bo, peo pyle mee panna alee otl): 


wi g e ey e 
But Il ae Il p+i, p+1° %7, 9+1° Vs, +1 

prrr,s pt+1~* r,s 2 

Cran 
I eat i Cs en p41 

p+1**p+1,7 ~~ n+l ESS Gl < e ee eee 

pt+1 ers 

q ie (G2 
=" pt+l1 
and Pec, Ul Bea ae 
p+l1, p+1 


il Eo ll (Coke Bhiras : 
pt+1**p+1,s ~~ C ‘aa 7 
S 


C+, +1 


Substituting these values in (31) we find 


q q 


pee — (tts 12—* Q-2. 2.(p —2)2(p —1)}2. 2? e—™ Sy Pray oe aes Py dea ae 


1 
2 es ie 4 
I I 


x 


Cr pt1 €s, p41 


and as the last fraction equals 
il 


(p—s+1)(p—7+ 1)’ 
a ( 
pet, = (— MB y--1-Bor1l?4. 27-2... (p—2)*(p— P29? Tl, 
with which the proof by induction for (25) is carried through. 


(8) We shall now return to (23). It consists of $2p+1$ terms of which the
$(2r+1)$st originally was found as $({}_{2r}\sigma_y^2 - {}_{2r-1}\sigma_y^2)$ so that


1 1 Pe Aes, peal 
Me MI 
x 1 arene ! 
3 2 or+1 
as ees a eal 
: go! Oph Onagno dy — 1 
Bry ee yer 1 Ou — ar i . 7! aN 1 , il 
i Nata atl i ene aren 
‘ i als 1 
3 5 Co ME Se or ii il 3 4 re op mie 3 
1 1 mies Ae l 1 
Fi aoa] SOE a IN| OVE 2, gs Ap + 1 | 


and 1 A 1 2 
4 RE ys | 
il 
2 1 Al — 
x 3 amo gies Joao 
Mere 1 1 1 
° - rosa ; 2r+1 24307" 4r — 3 
Qr-14 yor -24y = AT ’ 
N | i ¥ 1 [eal i fi ae 
| 3 Bl scat a : | ees: pel 
1 Ls ae ee at 4 7) DGuuene : 
2 2rd 2 Wr +3 
1 il 1 Il } il 1 
Wige2Y pee Lig amine peor 4r — 5 | Ort 1 Dips By 4r — 1 


With the notations later adopted we therefore find 


s=Pr 1 2 
2 
2 S ae eerarae (AV rosea 
s= 


o ot 
PACA ie PY 5 REY ae 1 1 
s aN : ri A 
s=r-1 2 2 
p o7 22 | 8 [a Ase at 
and 2r—-1 Fy — ar-2Fy = N- ae ) 2 
NAAN 
Substituting the values for A’s from (24) and (25) we get 
: r il ayy) 
9 9 o san s+1, 
arFy — gr—19y = N. 2?" (ir? oc | (— LS omer. aia ae ait aa | 
ea eee, is le 
E pit + rt 
and e 2 ane 
eh Cnr s=r-1 | ry ae 
s=0 
a oe L LY feed MW 
or, as ee . 
i 
eae a ey (2s + 1) (28 + 3)...:.. (2s + Qr — 1) 
Jia Sat Cost - Viena) rt1 
and 
2 
abiens eG aa —— = V 47 = 7 (28 43)(25 45)... (Qs +271) 
yt Ul +1 Cr, r 
i Pie oie Care al Be ie : ‘ 2 
arFy — 2r-1 Fy = ines | Sl 1s a(2s-— Dyi(2s Sys. (2s + 27 — } 
IR me Fs 5, (32), 


* The e’s and C do not of course have the same value in the two equations as they represent columns 
and elements in two different determinants. 
and


: o? (47 — 1) 2? {is ic 1)°B,-4, «2? (2s + 3) (2s + 5)...... 


2r-ay — 27-200 (jy — 1)? 227-2 


s=0 


(Decor Wy} ee a (33), 


2 


4 


which enables us to form ${}_n\sigma_y^2$ by successive summations from ${}_0\sigma_y^2 = \dfrac{\sigma^2}{N}$.

Before investigating the curve for ${}_n\sigma_y^2$ for a special $n$ we shall first look at ${}_n\sigma_y^2$
for $x = 0$ and $x = \pm 1$.

(9) From (33) we see that when $x = 0$

$${}_{2r-1}\sigma_y^2 = {}_{2r-2}\sigma_y^2 ;$$

${}_{2p}\sigma_y^2$ is for $x = 0$ most easily evaluated from the formula (13).


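Remembering that in our case $m_{2r} = \dfrac{1}{2r+1}$ we find from this, for $x = 0$,

$${}_{2p}\sigma_y^2 = \frac{\sigma^2}{N}\cdot\frac{{}^{3}_{p}\Delta}{{}^{1}_{p+1}\Delta}$$

and hence by (24) and (25)

$${}_{2p+1}\sigma_y^2 = {}_{2p}\sigma_y^2 = \frac{\sigma^2}{N}\left(\frac{3\cdot 5\cdots(2p+1)}{2\cdot 4\cdots 2p}\right)^2 \qquad (x = 0) \tag{34}$$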
(10) To evaluate ${}_n\sigma_y^2$ for $x = \pm 1$ we use (32) and (33). The sum in (32) may
be considered as Galed 1 d {a2"-1 (x2 — 1) 
han dane. Coe dee) 


with a number $r$ of differentiations. If these operations are undertaken directly
upon $x^{2r-1}(x^2-1)^r$ the result is

$$a_0(x^2-1)^r + a_1(x^2-1)^{r-1} + \dots + a_{r-1}(x^2-1) + a_r,$$

of which only $a_r = 2r(2r-2)\cdots 4\cdot 2 = r!\,2^r$ remains for $x = \pm 1$.


Corresponding to this the sum in (33) comes out from 


ae dld GaN drla (a7 ail) 
eda ean) | dx x dx 
by taking $(r-1)$ differentiations and therefore the sum equals, for $x = \pm 1$,
$(2r-2)(2r-4)\cdots 4\cdot 2 = (r-1)!\,2^{r-1}$.

Hence, for $x^2 = 1$,

$${}_{2r}\sigma_y^2 - {}_{2r-1}\sigma_y^2 = \frac{\sigma^2}{N}(4r+1)$$

and

$${}_{2r-1}\sigma_y^2 - {}_{2r-2}\sigma_y^2 = \frac{\sigma^2}{N}(4r-1),$$

or since ${}_0\sigma_y^2 = \dfrac{\sigma^2}{N}$,

$${}_n\sigma_y^2 = \frac{\sigma^2}{N}\{1 + 3 + 5 + \dots + (2n+1)\} = \frac{\sigma^2}{N}(n+1)^2 \qquad (x^2 = 1) \tag{35}$$
(11) In Section I under (7) it was found that
¥ (2) ode =o? 
Ve no dz = o7 (n+ 1) 
when the integration was taken over the places of observation. For the present
distribution $f(x)$ is 1, $\psi(x)$ constant and $\int \psi(x)\,dx = N$, hence the mean of ${}_n\sigma_y^2$ in
the range of observations is for a uniform continuous distribution

$$\frac{\sigma^2}{N}(n+1).$$
For the grouped observations in Section II we find by integration of the 
formulae for functions from the first to the sixth degree that 


[eth Se cee 1 
5] noid wet (l oct) 


IV. Uniform continuous distribution of observations with constant standard 
deviation. Special formulae. 
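(1) Let ${}_n\sigma_y^2 - {}_{n-1}\sigma_y^2$ be indicated by $S_n$, then the formulae (32) and (33)
give us

$$\begin{aligned}
S_1 &= \frac{\sigma^2}{N}\cdot 3x^2 \\
S_2 &= \frac{\sigma^2}{N}\cdot\frac{5}{4}\,(1-3x^2)^2 \\
S_3 &= \frac{\sigma^2}{N}\cdot\frac{7}{4}\,x^2(3-5x^2)^2 \\
S_4 &= \frac{\sigma^2}{N}\cdot\frac{9}{64}\,(3-30x^2+35x^4)^2 \\
S_5 &= \frac{\sigma^2}{N}\cdot\frac{11}{64}\,x^2(15-70x^2+63x^4)^2 \\
S_6 &= \frac{\sigma^2}{N}\cdot\frac{13}{256}\,(5-105x^2+315x^4-231x^6)^2
\end{aligned} \tag{36}$$

from which we form ${}_n\sigma_y^2$ beginning with

$$\begin{aligned}
{}_0\sigma_y^2 &= \frac{\sigma^2}{N} \\
{}_1\sigma_y^2 &= \frac{\sigma^2}{N}\,(1+3x^2) \\
{}_2\sigma_y^2 &= \frac{\sigma^2}{N}\left\{1+3x^2+\frac{5}{4}(1-3x^2)^2\right\} = \frac{\sigma^2}{N}\cdot\frac{1}{4}\,(9-18x^2+45x^4)
\end{aligned}$$

and further in the same way

$$\begin{aligned}
{}_3\sigma_y^2 &= \frac{\sigma^2}{N}\cdot\frac{1}{4}\,(9+45x^2-165x^4+175x^6) \\
{}_4\sigma_y^2 &= \frac{\sigma^2}{N}\cdot\frac{25}{64}\,(9-36x^2+294x^4-644x^6+441x^8) \\
{}_5\sigma_y^2 &= \frac{\sigma^2}{N}\cdot\frac{9}{64}\,(25+175x^2-1750x^4+6510x^6-9555x^8+4851x^{10}) \\
{}_6\sigma_y^2 &= \frac{\sigma^2}{N}\cdot\frac{7}{256}\,(175-1050x^2+17325x^4-93660x^6+225225x^8-245322x^{10}+99099x^{12})
\end{aligned} \tag{37}$$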
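The coefficients in (36) and (37) can be checked mechanically. The sketch below rests on an assumption that the text does not state: that each increment is $S_n = \frac{\sigma^2}{N}(2n+1)P_n(x)^2$ with $P_n$ the Legendre polynomial, which the polynomials listed in (36) satisfy. The helper names are illustrative only.

```python
from fractions import Fraction

def legendre_coeffs(n):
    """Exact coefficients (lowest power first) of the Legendre polynomial P_n."""
    p_prev, p = [Fraction(1)], [Fraction(0), Fraction(1)]
    if n == 0:
        return p_prev
    for k in range(1, n):
        # Bonnet recurrence: (k+1) P_{k+1} = (2k+1) x P_k - k P_{k-1}
        nxt = [Fraction(0)] * (k + 2)
        for i, c in enumerate(p):
            nxt[i + 1] += (2 * k + 1) * c
        for i, c in enumerate(p_prev):
            nxt[i] -= k * c
        p_prev, p = p, [c / (k + 1) for c in nxt]
    return p

def poly_mul(a, b):
    """Product of two polynomials given as coefficient lists."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def variance_poly(n):
    """Coefficients of N * (n sigma_y^2) / sigma^2, ASSUMED equal to sum_{k<=n} (2k+1) P_k(x)^2."""
    total = [Fraction(0)] * (2 * n + 1)
    for k in range(n + 1):
        pk = legendre_coeffs(k)
        for i, c in enumerate(poly_mul(pk, pk)):
            total[i] += (2 * k + 1) * c
    return total

# Third degree: 1/4 (9 + 45 x^2 - 165 x^4 + 175 x^6)
print([str(c) for c in variance_poly(3)])
```

Summing the coefficients of `variance_poly(n)` gives the value at $x = 1$, which should reproduce $(n+1)^2$ from (35).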
(2) Since ${}_n\sigma_y^2 = {}_{n-1}\sigma_y^2 + S_n$ the curve for ${}_n\sigma_y^2$ is entirely above the ${}_{n-1}\sigma_y^2$ curve
except where $S_n = 0$.
Solving the equations $S_n = 0$ the following roots are found:

    For S_1 = 0   x = 0
     "  S_2 = 0   x = ±√(1/3) = ±.5773
     "  S_3 = 0   x = 0,  x = ±√(3/5) = ±.7746
     "  S_4 = 0   x = ±√((15 ± 2√30)/35) = ±.8611, ±.3400
     "  S_5 = 0   x = 0,  x = ±√((35 ± 2√70)/63) = ±.9062, ±.5385
     "  S_6 = 0   x = ±.9325, ±.6612, ±.2386

Since all the roots are real and all lie between $-1$ and $+1$, ${}_n\sigma_y^2$ therefore
equals ${}_{n-1}\sigma_y^2$ for $n$ values of $x$ all of which are inside the range of the observations.

The adjusted values of the functions at these abscissae appear to be of special 
interest since they are uncorrelated as was shown in Section I under (9). 

(3) Looking at Diagram 2, representing the curves of ${}_n\sigma_y$ up to $n = 6$, it is seen,
as was also clear from the formulae for ${}_n\sigma_y^2$ at $x = 0$ and at $x^2 = 1$ given in the last
section, that while the standard deviation in the middle of the range increases slowly
with the degree of function it increases very rapidly at the ends of the range. At $x = 0$
the curve has a minimum when the degree of function is odd and a maximum when
it is even. Besides that the curve has $(2n-2)$ maxima and minima between
$-1$ and $1$. As the curve for ${}_n\sigma_y^2$ is of the $2n$th degree, ${}_n\sigma_y$ is therefore increasing
for $x$ increasing above $1$ or for $x$ decreasing below $-1$.


The abscissae of the maxima and minima are given in the following table.

    Degree of
    function   Abscissae of maxima                      Abscissae of minima
       1       (none)                                   0
       2       0                                        ±√(1/5) = ±.4472
       3       ±√(1/5) = ±.4472                         0,  ±√(3/7) = ±.6547
       4       0,  ±√(3/7) = ±.6547                     ±√((7 ± 2√7)/21) = ±.7651, ±.2852
       5       ±√((7 ± 2√7)/21) = ±.7651, ±.2852        0,  ±√((15 ± 2√15)/33) = ±.8302, ±.4689
       6       0,  ±√((15 ± 2√15)/33) = ±.8302, ±.4689  ±.8717, ±.5917, ±.2093
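The tabulated abscissae can be reproduced numerically. The following is a minimal sketch, again assuming (as an interpretation, not a statement of the text) that $N\,{}_n\sigma_y^2/\sigma^2 = \sum_{k\le n}(2k+1)P_k(x)^2$; it locates sign changes of a finite-difference slope on $0 < x < 1$ and refines them by bisection. Function names and the grid size are arbitrary choices.

```python
def legendre_values(n, x):
    """P_0(x), ..., P_n(x) by the three-term recurrence."""
    vals = [1.0, x]
    for k in range(1, n):
        vals.append(((2 * k + 1) * x * vals[k] - k * vals[k - 1]) / (k + 1))
    return vals[: n + 1]

def variance(n, x):
    """N * (n sigma_y^2) / sigma^2 for the uniform continuous distribution (assumed form)."""
    return sum((2 * k + 1) * p * p for k, p in enumerate(legendre_values(n, x)))

def slope(n, x, h=1e-6):
    """Central-difference estimate of the derivative of the variance curve."""
    return (variance(n, x + h) - variance(n, x - h)) / (2 * h)

def stationary_points(n, steps=2000):
    """Abscissae in the open interval (0, 1) where the variance curve has a maximum or minimum."""
    grid = [i / steps for i in range(1, steps)]
    found = []
    for lo, hi in zip(grid, grid[1:]):
        if slope(n, lo) * slope(n, hi) < 0:
            for _ in range(60):  # bisection on the sign of the slope
                mid = (lo + hi) / 2
                if slope(n, lo) * slope(n, mid) <= 0:
                    hi = mid
                else:
                    lo = mid
            found.append((lo + hi) / 2)
    return found

for degree in (3, 4):
    print(degree, [round(x, 4) for x in stationary_points(degree)])
```

The stationary point at $x = 0$ is excluded by construction; whether each remaining point is a maximum or a minimum follows from the alternation described in the text.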
Hence the curve for ${}_{n+1}\sigma_y$ has a maximum for the abscissae at which ${}_n\sigma_y$ has
a minimum. A comparison with the results in Section II shows that the abscissae
of the maxima found here are the same as those of the best places of observation
for $(n+1)$ equally big groups of observations of a function of the $n$th degree.
These places tally with the places where ${}_n\sigma_y^2$ was a maximum. Thus if we
imagine that we had started the investigations with a uniform distribution of
observations, and to lower the maxima of the curve of standard deviation had put
clusters of observations at those maxima and at the ends of the range we should not
get the best curve of standard deviation till all the observations of the continuous
distribution had been distributed at the $n-1$ places of maxima and at $1$ and $-1$.


The minima of the standard deviations obtained from a uniform continuous
distribution and the $(n+1)$ best groups of observations do not fall at the same
abscissae.


(4) The curves are very far from our ideal of a constant standard deviation 
throughout the range. To obtain the same maximum of standard deviation as 
(n + 1) groups could give us we should have to limit the part of the range used to 
the following fractions of the range: 


    for 1st degree   .58
     "  2nd   "      .73
     "  3rd   "      .80
     "  4th   "      .84
     "  5th   "      .83
     "  6th   "      .73


It is not likely that the range of values of the function which we investigate
would only be of interest inside a range so much smaller than that within which
we might actually observe; further it seems likely that observations all of which
were taken inside the smaller part of the range would give better information for
that special interval. I shall therefore examine in the following sections whether a
uniform distribution of observations to which are added clusters of observations at
the ends of the range may give a more satisfactory curve of standard deviations.


V. Uniform continuous distribution of observations with additional observations 
clustered at the ends of the range; constant standard deviation of observations. 
General formulae. 


(1) Suppose we have $\dfrac{N}{1+a}$ observations uniformly distributed from $-1$
to $1$ and besides $\dfrac{N}{2}\cdot\dfrac{a}{1+a}$ observations at $-1$ and the same number at $1$. We
then have

$$N m_{2r} = \frac{N}{1+a}\cdot\frac{1}{2r+1} + \frac{Na}{1+a}$$

or

$$m_{2r} = \frac{1}{1+a}\left(\frac{1}{2r+1} + a\right)$$

and

$$m_{2r+1} = 0.$$


According to (13) and (14) we find, 


a o° (1 + a) 
2pVy N i 
+ ae 


ie 1 ai 
| 
1 l+a gta 
| 2 tta 3+4 
ep A pall: _t + Qa 
pel’? Ww+3 
l+a se @ see 
i+a t+a er 
oe a ae 
2p +1 idee 
0 1 a 
1 y+ a 5 ta 
x t++a rata Oe 
1 1 
2p—2 | 
‘ Spee aae 2pewOw 
t+a t#+a ee iets 
tra 1t¢ aoood 
| 
| 1 i 1 i 
one a Ip +3 D woeee 
(38),


and
0 I seem ARE y2p—2 
1 
il lt+a ek ene Fees 
2 1 1 1 | 
x x +a 5s +a GORtGaO Seu 
Il 1 Il 
| n2p—2 2 : 
ea? ii aks a sme erg S 
1 
il soya eee". aN Sec es esis e 
1 1 1 | 
gta 5 +a ddéacas 2p +1 Qa | 
| 
1 1 1 
phe Spam o Apooop pes aee & 
0 1 Ce qpo 
1 1 I 
i 3 +a i pes acctece page 
2 1 1 1 
aw 5 +a 7 +o SOBASO m+3°% 
Deve att ane : +a 
2p+1 2p 3 4p —1 } 
ae 1 1 j 
gta ig ar Gk Sasonc Srpmatenuae 
| tra t+a : roy) 
| 4 Parishes acdsee op +3 | 
mat mala : IP : =e 
2p +1 a 2p +3 sevens 4p — 1 
sabretencmesaloo) 5 
and, according to I under (6), 


eerece 


‘ Brees 
205, — 2p-19y = N (1 ale a) x 
il 2 
1 l+a Taw Seed ae 1 +a 
x tra t+a : +a 
1 BG tu erence Pee 
x tig Liat cOemn ibe! : ae i 
yi ‘ 2p +3 
p2p 1 I | 1 + 
2 pea yeas vee nih ee 
l+a Eich. y Yuviaee pa l+a t+a 
k+a 1 Leas! Lita L+a 
3 5 (OB “aeGno0 2p +1 Qa 3 5 
bids 1 ligt 1 Ian 
pe ss Hop ela pe ee iy 28) |) so pe mores 
and 
gens hasiie)en 
20417 — apFy uae 
1 1 if 
1 t+a tta oo... paar 
1 t+a ¢++a : als 
z teh, adedos Ip 43 a 
x4 aL JE. Gi EE ohy anny. d +a 
; : 29 + 5 
ye 1 | z ~ | 1 
eS pee 6 oe foe ek 
1 
t+a Tod, wees amen ata t+a 
1 
tt+a Laat Sn yee t+ta t+ta 
he we Dee eee Loe 
Qp+1° 2p +3 aa pe io 2p +5 
(2) For the reduction of these formulae we have to evaluate the determinant
of $p$th order

$${}^{q}_{p}\delta = \begin{vmatrix}
\dfrac{1}{2q-1}+a & \dfrac{1}{2q+1}+a & \cdots & \dfrac{1}{2q+2p-3}+a \\
\dfrac{1}{2q+1}+a & \dfrac{1}{2q+3}+a & \cdots & \dfrac{1}{2q+2p-1}+a \\
\vdots & \vdots & & \vdots \\
\dfrac{1}{2q+2p-3}+a & \dfrac{1}{2q+2p-1}+a & \cdots & \dfrac{1}{2q+4p-5}+a
\end{vmatrix}.$$


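By subtracting from the elements of each row the elements of the preceding
and leaving the first row as it is, it is transformed to

$${}^{q}_{p}\delta = (-1)^{p-1} \times \begin{vmatrix}
\dfrac{1}{2q-1}+a & \dfrac{1}{2q+1}+a & \cdots & \dfrac{1}{2q+2p-3}+a \\
\dfrac{2}{(2q-1)(2q+1)} & \dfrac{2}{(2q+1)(2q+3)} & \cdots & \dfrac{2}{(2q+2p-3)(2q+2p-1)} \\
\vdots & \vdots & & \vdots \\
\dfrac{2}{(2q+2p-5)(2q+2p-3)} & \dfrac{2}{(2q+2p-3)(2q+2p-1)} & \cdots & \dfrac{2}{(2q+4p-7)(2q+4p-5)}
\end{vmatrix},$$

which when the columns undergo the same process takes the form

$${}^{q}_{p}\delta = \begin{vmatrix}
\dfrac{1}{2q-1}+a & \dfrac{2}{(2q-1)(2q+1)} & \dfrac{2}{(2q+1)(2q+3)} & \cdots & \dfrac{2}{(2q+2p-5)(2q+2p-3)} \\
\dfrac{2}{(2q-1)(2q+1)} & \dfrac{2\cdot 4}{(2q-1)(2q+1)(2q+3)} & \dfrac{2\cdot 4}{(2q+1)(2q+3)(2q+5)} & \cdots & \dfrac{2\cdot 4}{(2q+2p-5)\cdots(2q+2p-1)} \\
\dfrac{2}{(2q+1)(2q+3)} & \dfrac{2\cdot 4}{(2q+1)(2q+3)(2q+5)} & \dfrac{2\cdot 4}{(2q+3)(2q+5)(2q+7)} & \cdots & \dfrac{2\cdot 4}{(2q+2p-3)\cdots(2q+2p+1)} \\
\vdots & \vdots & \vdots & & \vdots \\
\dfrac{2}{(2q+2p-5)(2q+2p-3)} & \dfrac{2\cdot 4}{(2q+2p-5)\cdots(2q+2p-1)} & \dfrac{2\cdot 4}{(2q+2p-3)\cdots(2q+2p+1)} & \cdots & \dfrac{2\cdot 4}{(2q+4p-9)\cdots(2q+4p-5)}
\end{vmatrix}.$$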


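Let us introduce the notation

$${}^{q}_{p}D = \begin{vmatrix}
\dfrac{1}{(2q-1)(2q+1)(2q+3)} & \dfrac{1}{(2q+1)(2q+3)(2q+5)} & \cdots & \dfrac{1}{(2q+2p-3)\cdots(2q+2p+1)} \\
\dfrac{1}{(2q+1)(2q+3)(2q+5)} & \dfrac{1}{(2q+3)(2q+5)(2q+7)} & \cdots & \dfrac{1}{(2q+2p-1)\cdots(2q+2p+3)} \\
\vdots & \vdots & & \vdots \\
\dfrac{1}{(2q+2p-3)\cdots(2q+2p+1)} & \dfrac{1}{(2q+2p-1)\cdots(2q+2p+3)} & \cdots & \dfrac{1}{(2q+4p-5)\cdots(2q+4p-1)}
\end{vmatrix}.$$

Then, since for $a = 0$ ${}^{q}_{p}\delta$ equals the determinant ${}^{q}_{p}\Delta$, we have

$${}^{q}_{p}\delta = {}^{q}_{p}\Delta + a\cdot 2^{3(p-1)}\cdot{}^{q}_{p-1}D \tag{42}$$

and the problem is reduced to the evaluation of ${}^{q}_{p}D$.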
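The reduction of $\delta$ to $\Delta$ and $D$ can be checked in exact arithmetic. The sketch below assumes the element patterns $\frac{1}{2q+2(i+j)-5}+a$ for $\delta$ and $\frac{1}{(2q+2(i+j)-5)(2q+2(i+j)-3)(2q+2(i+j)-1)}$ for $D$ (rows and columns counted from 1), with the empty determinant taken as 1; the function names are illustrative.

```python
from fractions import Fraction

def det(matrix):
    """Exact determinant by Gaussian elimination over Fractions (empty matrix gives 1)."""
    m = [row[:] for row in matrix]
    n, sign, result = len(m), 1, Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            sign = -sign
        result *= m[c][c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= f * m[c][k]
    return sign * result

def delta(p, q, a):
    """Determinant of order p with elements 1/(2q + 2(i+j) - 1) + a, i, j counted from 0."""
    return det([[Fraction(1, 2 * q + 2 * (i + j) - 1) + a for j in range(p)]
                for i in range(p)])

def big_d(p, q):
    """Determinant of order p whose elements are reciprocals of three consecutive odd factors."""
    def element(s):
        base = 2 * q + 2 * s - 1
        return Fraction(1, base * (base + 2) * (base + 4))
    return det([[element(i + j) for j in range(p)] for i in range(p)])

a = Fraction(2, 7)
for p in range(1, 5):
    lhs = delta(p, 2, a)
    rhs = delta(p, 2, Fraction(0)) + a * 8 ** (p - 1) * big_d(p - 1, 2)
    print(p, lhs == rhs)
```

Because $\delta$ is linear in $a$ (the added matrix has rank one), agreement for a single non-zero $a$ per $(p, q)$ already confirms the coefficient.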
(3) It shall be proved by induction that

$${}^{q}_{p}D = \frac{\{1^{p}\cdot 2^{p-1}\cdots(p-1)^{2}\cdot p\}^{2}\cdot 2^{p(p-2)}\,(p+1)}{(2q-1)(2q+1)^{2}(2q+3)^{3}\cdots(2q+2p-5)^{p-1}(2q+2p-3)^{p}(2q+2p-1)^{p}(2q+2p+1)^{p}(2q+2p+3)^{p-1}\cdots(2q+4p-3)^{2}(2q+4p-1)} \tag{43}$$

It contains the $2p+1$ different factors of the elements with indices increasing
from 1 at the extreme to $p$ in the middle so that the three factors of which the one
diagonal line of the determinant consists occur with the index $p$.

For $p = 1$ the formula gives

$${}^{q}_{1}D = \frac{1}{(2q-1)(2q+1)(2q+3)}$$

as it ought to.

As the determinant is orthosymmetrical the relation

$$\Delta = \frac{\Delta_{ss}\,\Delta_{s's'} - \Delta_{ss'}^{2}}{\Delta_{s,s,s',s'}}$$

holds.

Applied on ${}^{q}_{p+1}D$ for $s = 1$ and $s' = p+1$ it may be written

$${}^{q}_{p+1}D = \frac{{}^{q}_{p}D\cdot{}^{q+2}_{p}D - \left({}^{q+1}_{p}D\right)^{2}}{{}^{q+2}_{p-1}D} \tag{44}$$


Looking first at the numerator of (43) we see that it has the same value for the
two terms of the numerator of (44), and divided by the corresponding factor of
${}^{q+2}_{p-1}D$ it becomes

$$\frac{\{1^{p}\,2^{p-1}\cdots(p-1)^{2}\,p\}^{4}\,(p+1)^{2}\cdot 2^{2p(p-2)}}{\{1^{p-1}\,2^{p-2}\cdots(p-2)^{2}(p-1)\}^{2}\,p\cdot 2^{(p-1)(p-3)}}
= \frac{\{1^{p+1}\,2^{p}\cdots(p-2)^{4}(p-1)^{3}\,p^{2}\,(p+1)\}^{2}\cdot 2^{p^{2}-3}}{p}.$$

To evaluate the factor in ${}^{q}_{p+1}D$ arising from the denominator of (43) we shall
give a table of the indices with which the different factors occur in the $D$'s and
their ratios. (A = ${}^{q}_{p}D$, B = ${}^{q+2}_{p}D$, C = ${}^{q+2}_{p-1}D$, E = $({}^{q+1}_{p}D)^{2}$.)

    Factor      A      B      C      E         A.B/C   E/C
    2q-1        1      .      .      .         1       .
    2q+1        2      .      .      2         2       2
    2q+3        3      1      1      4         3       3
    ...
    2q+2p-5     p-1    p-3    p-3    2(p-2)    p-1     p-1
    2q+2p-3     p      p-2    p-2    2(p-1)    p       p
    2q+2p-1     p      p-1    p-1    2p        p       p+1
    2q+2p+1     p      p      p-1    2p        p+1     p+1
    2q+2p+3     p-1    p      p-1    2p        p       p+1
    2q+2p+5     p-2    p      p-2    2(p-1)    p       p
    2q+2p+7     p-3    p-1    p-3    2(p-2)    p-1     p-1
    ...
    2q+4p-1     1      3      1      4         3       3
    2q+4p+1     .      2      .      2         2       2
    2q+4p+3     .      1      .      .         1       .
Hence the factor arising from the denominator of (43) is

$$\frac{(2q+2p-1)(2q+2p+3) - (2q-1)(2q+4p+3)}{(2q-1)(2q+1)^{2}\cdots(2q+2p-3)^{p}(2q+2p-1)^{p+1}(2q+2p+1)^{p+1}(2q+2p+3)^{p+1}(2q+2p+5)^{p}\cdots(2q+4p+1)^{2}(2q+4p+3)}.$$

The numerator of this equals $4p(p+2)$,
multiplying with the factor previously found we therefore get

$${}^{q}_{p+1}D = \frac{\{1^{p+1}\,2^{p}\cdots(p-1)^{3}\,p^{2}\,(p+1)\}^{2}\cdot 2^{(p+1)(p-1)}\,(p+2)}{(2q-1)(2q+1)^{2}\cdots(2q+2p-3)^{p}(2q+2p-1)^{p+1}(2q+2p+1)^{p+1}(2q+2p+3)^{p+1}(2q+2p+5)^{p}\cdots(2q+4p+1)^{2}(2q+4p+3)},$$

which is what we wanted to prove.
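The closed form proved here lends itself to a direct check for small orders. In the sketch below the determinant is computed exactly and compared with the product formula (43), read as: numerator $\{1^{p}2^{p-1}\cdots p\}^{2}\,2^{p(p-2)}(p+1)$, and denominator the $2p+1$ odd factors from $2q-1$ to $2q+4p-1$ with indices rising from 1 to $p$, staying at $p$ for the three middle factors, and falling back to 1. Names are hypothetical.

```python
from fractions import Fraction

def det(matrix):
    """Exact determinant by Gaussian elimination over Fractions."""
    m = [row[:] for row in matrix]
    n, sign, result = len(m), 1, Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            sign = -sign
        result *= m[c][c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= f * m[c][k]
    return sign * result

def d_direct(p, q):
    """The determinant D of order p, built element by element."""
    def element(s):
        base = 2 * q + 2 * s - 1
        return Fraction(1, base * (base + 2) * (base + 4))
    return det([[element(i + j) for j in range(p)] for i in range(p)])

def d_closed(p, q):
    """Closed form (43) as read above."""
    inner = Fraction(1)
    for k in range(1, p + 1):
        inner *= Fraction(k) ** (p - k + 1)          # 1^p * 2^(p-1) * ... * p
    numerator = inner ** 2 * Fraction(2) ** (p * (p - 2)) * (p + 1)
    denominator = Fraction(1)
    for j in range(2 * p + 1):                        # factors 2q-1, 2q+1, ..., 2q+4p-1
        index = min(j + 1, p, 2 * p + 1 - j)
        denominator *= Fraction(2 * q - 1 + 2 * j) ** index
    return numerator / denominator

for p in range(1, 5):
    print(p, d_direct(p, 1) == d_closed(p, 1))
```

For $p = 2$ both sides give $12/\{(2q-1)(2q+1)^2(2q+3)^2(2q+5)^2(2q+7)\}$, which can also be confirmed by hand from the $2\times 2$ determinant.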


(4) When the values of $\Delta$ and $D$ are introduced in (42) we get

$${}^{q}_{p}\delta = \frac{\{1^{p-1}\,2^{p-2}\cdots(p-2)^{2}(p-1)\}^{2}\cdot 2^{p(p-1)}}{(2q-1)(2q+1)^{2}\cdots(2q+2p-5)^{p-1}(2q+2p-3)^{p}(2q+2p-1)^{p-1}\cdots(2q+4p-7)^{2}(2q+4p-5)}$$
$$\qquad + a\cdot 2^{3(p-1)} \times \frac{\{1^{p-1}\,2^{p-2}\cdots(p-2)^{2}(p-1)\}^{2}\cdot 2^{(p-1)(p-3)}\,p}{(2q-1)(2q+1)^{2}\cdots(2q+2p-7)^{p-2}(2q+2p-5)^{p-1}(2q+2p-3)^{p-1}(2q+2p-1)^{p-1}(2q+2p+1)^{p-2}\cdots(2q+4p-7)^{2}(2q+4p-5)}$$

or

$${}^{q}_{p}\delta = \frac{\{1^{p-1}\,2^{p-2}\cdots(p-2)^{2}(p-1)\}^{2}\cdot 2^{p(p-1)}\,[1 + ap\,(2q+2p-3)]}{(2q-1)(2q+1)^{2}\cdots(2q+2p-5)^{p-1}(2q+2p-3)^{p}(2q+2p-1)^{p-1}\cdots(2q+4p-7)^{2}(2q+4p-5)} \tag{45}$$

The denominators of the formulae (38) to (41) for ${}_n\sigma_y^2$ are now known since they
only consist of the factors ${}^{1}\delta$ and ${}^{2}\delta$. To be able to write down the general expression
for ${}_n\sigma_y^2$ we should have to evaluate the minors of $\delta$, but their form is so complicated
that a direct calculation of the determinants for the degrees of function in question
appears to be simpler. With the material in hand we are however able to determine
${}_n\sigma_y^2$ for $x = 0$ and $x^2 = 1$.

(5) From (38) and (39) we see that, for $x = 0$,

$${}_{2p}\sigma_y^2 = {}_{2p+1}\sigma_y^2 = \frac{\sigma^2}{N}(1+a)\,\frac{{}^{3}_{p}\delta}{{}^{1}_{p+1}\delta},$$

and with the $\delta$'s as given by (45)

$${}_{2p}\sigma_y^2 = {}_{2p+1}\sigma_y^2 = \frac{\sigma^2\,(1+a)\cdot 3^{2}\cdot 5^{2}\cdots(2p-1)^{2}(2p+1)^{2}\,[1+ap\,(2p+3)]}{N\,\{1\cdot 2\cdot 3\cdots p\}^{2}\cdot 2^{2p}\,[1+a(p+1)(2p+1)]}$$

or

$${}_{2p}\sigma_y^2 = {}_{2p+1}\sigma_y^2 = \frac{\sigma^2}{N}\left(\frac{3\cdot 5\cdots(2p+1)}{2\cdot 4\cdots 2p}\right)^{2}\frac{(1+a)\,[1+ap\,(2p+3)]}{1+a(p+1)(2p+1)} \qquad (x = 0) \tag{46}$$


(6) To find ${}_n\sigma_y^2$ for $x = 1$ we have to evaluate the determinant of $(p+1)$st order,

$$\begin{vmatrix}
0 & 1 & 1 & \cdots & 1 \\
1 & \dfrac{1}{2q-1}+a & \dfrac{1}{2q+1}+a & \cdots & \dfrac{1}{2q+2p-3}+a \\
1 & \dfrac{1}{2q+1}+a & \dfrac{1}{2q+3}+a & \cdots & \dfrac{1}{2q+2p-1}+a \\
\vdots & \vdots & \vdots & & \vdots \\
1 & \dfrac{1}{2q+2p-3}+a & \dfrac{1}{2q+2p-1}+a & \cdots & \dfrac{1}{2q+4p-5}+a
\end{vmatrix}.$$

Treating it as ${}^{q}_{p}\delta$ was treated under (2) of this section, except that now two
rows or columns are left unaltered, it takes the form

$$\begin{vmatrix}
0 & 1 & 0 & \cdots & 0 \\
1 & \dfrac{1}{2q-1}+a & \dfrac{2}{(2q-1)(2q+1)} & \cdots & \dfrac{2}{(2q+2p-5)(2q+2p-3)} \\
0 & \dfrac{2}{(2q-1)(2q+1)} & \dfrac{2\cdot 4}{(2q-1)(2q+1)(2q+3)} & \cdots & \dfrac{2\cdot 4}{(2q+2p-5)\cdots(2q+2p-1)} \\
\vdots & \vdots & \vdots & & \vdots \\
0 & \dfrac{2}{(2q+2p-5)(2q+2p-3)} & \dfrac{2\cdot 4}{(2q+2p-5)\cdots(2q+2p-1)} & \cdots & \dfrac{2\cdot 4}{(2q+4p-9)\cdots(2q+4p-5)}
\end{vmatrix}
= -\,2^{3(p-1)}\cdot{}^{q}_{p-1}D.$$


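Hence we find from (38), for $x = 1$,

$${}_{2p}\sigma_y^2 = \frac{\sigma^2}{N}(1+a)\left\{\frac{2^{3p}\cdot{}^{1}_{p}D}{{}^{1}_{p+1}\delta} + \frac{2^{3(p-1)}\cdot{}^{2}_{p-1}D}{{}^{2}_{p}\delta}\right\}.$$

Now from (43) and (45) we get

$$\frac{2^{3p}\cdot{}^{q}_{p}D}{{}^{q}_{p+1}\delta} = \frac{(p+1)(2q+2p-1)}{1+a(p+1)(2q+2p-1)} \tag{47}$$

and therefore

$${}_{2p}\sigma_y^2 = \frac{\sigma^2}{N}(1+a)\left\{\frac{(p+1)(2p+1)}{1+a(p+1)(2p+1)} + \frac{p\,(2p+1)}{1+ap\,(2p+1)}\right\}$$

or

$${}_{2p}\sigma_y^2 = \frac{\sigma^2}{N}(1+a)(2p+1)\left\{\frac{p+1}{1+a(p+1)(2p+1)} + \frac{p}{1+ap\,(2p+1)}\right\} \qquad (x = 1) \tag{48}$$

In the same way we get from (39),

$${}_{2p-1}\sigma_y^2 = \frac{\sigma^2}{N}(1+a)\left\{\frac{2^{3(p-1)}\cdot{}^{1}_{p-1}D}{{}^{1}_{p}\delta} + \frac{2^{3(p-1)}\cdot{}^{2}_{p-1}D}{{}^{2}_{p}\delta}\right\}$$

which by the relation between ${}_{p}D$ and ${}_{p+1}\delta$ just found is reduced to

$${}_{2p-1}\sigma_y^2 = \frac{\sigma^2}{N}(1+a)\left\{\frac{p\,(2p-1)}{1+ap\,(2p-1)} + \frac{p\,(2p+1)}{1+ap\,(2p+1)}\right\} \qquad (x = 1) \tag{49}$$

Both (48) and (49) are covered by the formula

$${}_{n}\sigma_y^2 = \frac{\sigma^2}{N}(1+a)(n+1)\left\{\frac{n}{2+an(n+1)} + \frac{n+2}{2+a(n+1)(n+2)}\right\} \qquad (x = 1) \tag{50}$$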

(7) The evaluation of ${}_n\sigma_y^2$ for special values of $n$ can be made easier by a
transformation of the determinant


0 —— ve e 
a St Et a NE, r 
1 gee ee et fg | 
hee Doe 2¢+2p—3 | 
x eee get oe +a) 

: 29-1 2g+ 300 0" 2g + 2p —1 

pid = 4 1 1 1 

Me ig 1 dg45°% SeSOr Iqaaap Pal | 
: | 
SA ae sell cling Ae 3) 

“  2q4+2p-1 2g+2p+1 0°00" 2¢+4p—3 


Leaving the first row unaltered and subtracting from each of the others the
preceding we get a determinant the first column of which is
$1,\ (x^2-1),\ x^2(x^2-1),\ \dots,\ x^{2p-2}(x^2-1)$,
while the other columns are identical with those of the determinant $\delta$ previously
treated in the same way. When next the two first rows are left as they are and
from each of the others is subtracted the preceding one the result is

Ba = (= L)2P~t x 
1 ar eS te: ae sie ea 
2q—1 2q+1 2g 4- 2p — 3 
ee on See 2 i De esta 
(2g — 1) (2¢ + 1) (2¢ + 1) (2¢ + 3) (2¢ + 2p — 38) (2q + 2p -- 1) 
(1 — a2) 2.4 ee Raley | 0h Bee: 
(2¢—1) Q¢+1)(2¢+3) (29+ 12 are Bas 2 (2¢+ 2p—3)...(2q+2p+ 1) 


2p-4/ Dea: A 2.4 
g2P-4( 1 —a?)2 ee ee 
(2g+ 2p—5)...(2g+2p—1) (2g+2p—-: = (2g+2p+ 1)" (2g +4p—7)...(2¢+4p—3) 
Leaving now three rows unaltered, next time four and so on, it is clear that we 
shall at last after p of these sets of operations get 


p(p+1) 


q 
oti aa (— 1) cae x 


1 il 1 
ae Sealy ee yaa | 99+ 2p—3 
| : 2 2 2 
eee (2q — 1) (2¢ + 1) (2q¢ + 1) (2q + 3) = (7 lp 3p = 1) 
(1 —22)2 ee 2.40 44 eee on 2.4 
| (2g — 1) (2g + 1) (2¢+38) (2¢+ 1) (2¢ + 3) (29+ 5) “(2g +2p—8)...... (2g +2p +1) 
(1-22) 2.4...2p 2.4 ...29 2.4... 2p 
| (2qg—1)...... (2g+2p—1) (2q+1)...... (2g+2p+1)° (2¢+2p—3)...... (2¢+ 4p --3) 


By treating the columns in the same way, leaving first two then three and so 
on unaltered, we find after the first set of operations 
ae Fe (e iam ne 
1 : ae = 
ee @i— Det 0 Oe eae eee 
12 - DE ee 2.4 2s. 
. (2q — 1) (2¢ + 1) (2q — 1) (2¢ + 1) (2¢ + 3) (2q+ 2p—5).. ae 1) 
(eee 2.4 2.4.6 2.4.6 
(2g — 1) (2¢ + 1) (2¢ 4+ 3) GG lee EE) (2qg+ 2p —5)...(2g+ 29 +1) 
(1—22) 2A etaip 2.4... 2p (2p + 2) 2.4... 2p (2p + 2) 


p E 
(2qg—1) (2g+1)...(2¢+ 2p—1) (2g—1)(2q+1)...2q+ 2p+1) ~ (29+ 2p—5)...(2q¢+ 4p—3) 


and after (p — 1) sets of operations 


1 1 i 20 reo 2.4... (2p — 2) 

2q—1 (2¢ — 1) (2¢ + 1) ‘"" (2q—1)(2¢+1)...(2¢+ 2p— 3) 

L— 2 2 2.4 ‘ 2) A ret) 
(2q — 1) (2q + 1) (2g — 1) (2g + 1)(2qg+ 3)” (2q--1)(2g+])...(2q+ 2p—1) | 
(22)? 2.4 2 eae e 2.4... 2p (2p + 2) | 
(2g — 1) (29+ 1) (2¢+ 3) = (2¢—1)(2¢+1)(2¢ +38)(2¢ 4-5) (2qg—1)(2g +1)...(29+ 2+ 1) | 
(122)? 2.4...2p 2.4...2p (2p + 2) Pewee idee 2) Oi) 
(2g—1)(2¢4+1)...(2g+2p—1) (2g—1)(2q4+1)...(2g+ 2p+1) (2¢—1)(2¢+1)...(2¢+ 4p— 3) 

: p(p+1), (p= =e 
since (eae =(-— 1)”. 


Here the first element of the last » — 1 columns is seen to occur as factor for 
the whole column so that we can put outside the factor 
ieee (CAD PAD 72) 
Qq— IP Rq + DP Oy + 82 Aq + BP... q+ Bp — Hq + 2p — 3) 
Dat) 
aes 20st cA Dig 2) me) cae 
~ Qq— 19g + Rg + BP-* q+ DPF... Bg + Bp — 5) Og + Bp — 8)’ 


the resulting expression being x 
p(p-1) 
o (— 1)" .19-1. 29-2... (p—2)2(p—1)2 ? 
pen’ (2g — 1)P-3 (2g + 1)?-3 (2 7+ 3) “3 (2q + 2p — 5)® (q+ 2p — 3) 
| pies 
| It aT +a 1 I 
Pater ms as bse ee 
(2¢ — 1) (2¢+ 1) 2¢+3 a 2¢+2p—1 
Spygate ae Peas 2p (2p + 2) 
(2g — 1) (2¢ + 1) (2¢ + 3) (2q + 3) (2¢ + 5) "(2g + 2p — 1) (2g + 2p + 1) 

(1 — a2)? Digan 20 AUS tan (PDS 74) 2p... (4 — 2) 


(2g —1) (2g +1)...(2g+ 2p—1) (2q¢4+ 38)...(2g+ 2p +1)" (Qq + 2p — 1)...(2q + 4p — 3) 
In our formulae only the two cases $q = 1$ and $q = 2$ occur, for which according to
this we find
: ; p(p-1) 
Perse pe 
a 3P-1 5e-2 77-3 |, (2p — 3)? (2p — 1) : 
1 l+a 1 1 a 1 
(a 2 4 6 2p 
ae 3, 5 7 tai 2p +1 
(12)? oA 4.6 6.8 2p (2p +2) — 
ie WSS es DT 7.9 "(2p +1) (p43) 
teas 2-4-6 468 6.8.10 2p (2p + 2) (2p +4) _ 
ee ie 5.7.9 7.9.11 ~ Qp+1)2p+3)2p +5) 
jae 2.4...2p 4.6...2p+2) 6.8...(2p+4) 2p (2p +2)... (4p — 2) 
“ 1.3...2p+1) 5.7...(2p+ 3) 7.9...(2p4+5)  (2p+1)(29+ 3)...(4p9—1) 
cae (51) 
and 
p(p-1) 
Pe ol 22 tp = 22 (p= 1)2\ 
ate Dee (2p 12 (2p 1) 
1 sta 1 I she 1 
i ee ss £ 6 2p 
31,5 if 9 2p+3 
(1—22)2 2.40 4:6 poe) EPP aie 
/ Dey 7.9 9.11 (2p + 3) (2p +5) 
fea Bel 2p (2p + 2) (2p+ 4) 
 ee® oe Omieats (2p + 3) (2p +5) (2p +7) 
(22)? 2.4.6...29 4.6...(29+2) 6. 8...(2p +4) 2p (2p + 2) ... (4p — 2) 
~’ 3.5.7...(Qp+3) 7.9...(2p+5) 9.11...(2p+7)"" Qp+3)Qp+5)...(4p4+1) 


VI. Uniform continuous distribution of observations with additional clusters at the 
ends of the range; constant standard deviation of observations. Special formulae. 


(1) Our first task shall be to work out the formulae for ${}_n\sigma_y^2 - {}_{n-1}\sigma_y^2$ for values
of $n$ up to 6, the next to find what values should be given to $a$ in order to make
${}_n\sigma_y^2$ as flat a curve as possible within the range of observations.


With the notations just introduced (40) and (41) take the form 


ee 2 es 
Soy = enFy — an—-19%y N (1 +a) i=: 
pO : p+10 
2
2 2 o? 2 pid” 
and Sop ti = ep419y — aFy = ay (1 + a) @ 2 2° 
p+ 9119 
From these formulae we find, after applying (45), (51) and (52),

$$S_1 = \frac{\sigma^2}{N}\cdot\frac{3(1+a)\,x^2}{1+3a} \tag{53}$$

$$S_2 = \frac{\sigma^2}{N}\cdot\frac{5\,[2+3(1+a)(x^2-1)]^2}{2^2\,(1+6a)} \tag{54}$$

$$S_3 = \frac{\sigma^2}{N}\cdot\frac{7\,(1+a)\,x^2\,[2+5(1+3a)(x^2-1)]^2}{2^2\,(1+3a)(1+10a)} \tag{55}$$

$$S_4 = \frac{\sigma^2}{N}\cdot\frac{9\,(1+a)\,[8+20(2+9a)(x^2-1)+35(1+6a)(x^2-1)^2]^2}{2^6\,(1+6a)(1+15a)} \tag{56}$$

$$S_5 = \frac{\sigma^2}{N}\cdot\frac{11\,(1+a)\,x^2\,[8+28(2+15a)(x^2-1)+63(1+10a)(x^2-1)^2]^2}{2^6\,(1+10a)(1+21a)} \tag{57}$$

$$S_6 = \frac{\sigma^2}{N}\cdot\frac{13\,(1+a)\,[16+168(1+10a)(x^2-1)+126(3+40a)(x^2-1)^2+231(1+15a)(x^2-1)^3]^2}{2^8\,(1+15a)(1+28a)} \tag{58}$$
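One property that the bracketed polynomials in (54) to (58) should have, if they are the orthogonal polynomials belonging to this distribution of observations, is that each integrates to zero against every lower power of $x$. The sketch below tests this under the assumed weighting (uniform weight $\tfrac12$ on $[-1, 1]$ plus weight $\tfrac{a}{2}$ at each end, the common factor $1/(1+a)$ dropped); the coefficient lists are transcribed from the formulae, and the helper names are hypothetical.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Product of two coefficient lists (lowest power first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            out[i + j] += x * y
    return out

def from_powers_of_t(cs):
    """Expand sum_j cs[j] * (x^2 - 1)^j into powers of x."""
    out, power = [], [Fraction(1)]
    for c in cs:
        term = [c * v for v in power]
        size = max(len(out), len(term))
        out = [(out[i] if i < len(out) else 0) + (term[i] if i < len(term) else 0)
               for i in range(size)]
        power = poly_mul(power, [Fraction(-1), Fraction(0), Fraction(1)])
    return out

def bracket(n, a):
    """Polynomial whose square appears in S_n; odd degrees carry the extra factor x."""
    table = {
        2: [2, 3 * (1 + a)],
        3: [2, 5 * (1 + 3 * a)],
        4: [8, 20 * (2 + 9 * a), 35 * (1 + 6 * a)],
        5: [8, 28 * (2 + 15 * a), 63 * (1 + 10 * a)],
        6: [16, 168 * (1 + 10 * a), 126 * (3 + 40 * a), 231 * (1 + 15 * a)],
    }
    poly = from_powers_of_t([Fraction(c) for c in table[n]])
    return [Fraction(0)] + poly if n % 2 else poly

def design_integral(coeffs, a):
    """Integral of a polynomial against the assumed design weighting."""
    return sum(c * (Fraction(1, k + 1) + a)
               for k, c in enumerate(coeffs) if k % 2 == 0)

def is_orthogonal(n, a):
    """True if the degree-n bracket is orthogonal to 1, x, ..., x^(n-1)."""
    poly = bracket(n, a)
    return all(design_integral([Fraction(0)] * k + poly, a) == 0 for k in range(n))

a = Fraction(3, 7)
print({n: is_orthogonal(n, a) for n in range(2, 7)})
```

The check is a necessary condition only; it does not test the numerical factors or the denominators of (53) to (58), which have to be compared with (46) and (50) separately.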
(2) We shall now look at ${}_n\sigma_y^2$ for special values of $n$ and as a first attempt at
finding a flat curve for ${}_n\sigma_y^2$ try to make ${}_n\sigma_y^2$ for $x = 0$ equal to ${}_n\sigma_y^2$ for $x = 1$.


For a linear function we find, since

$${}_1\sigma_y^2 = {}_0\sigma_y^2 + S_1,$$

$${}_1\sigma_y^2 = \frac{\sigma^2}{N}\left(1 + \frac{3(1+a)}{1+3a}\,x^2\right) \tag{59}$$

As $a$ is positive it is obvious that we cannot make ${}_1\sigma_y^2$ for $x = 0$ equal to
${}_1\sigma_y^2$ for $x = 1$, which indeed we knew beforehand. This follows because we have
proved that ${}_n\sigma_y^2$ is of $2n$th degree and never lower.


For $x = 0$ we find

$${}_1\sigma_y^2 = \frac{\sigma^2}{N},$$

which holds for any symmetrical distribution of observations with constant
standard deviation. $a$ is the ratio between the number of observations at the ends
of the range and the number uniformly distributed through the range, it may
therefore vary from $0$ to $\infty$. As $\dfrac{3(1+a)}{1+3a}$ decreases when $a$ increases we get the
flattest possible curve when $a = \infty$, that is when the distribution of observations
consists of two groups at the ends of the range. Then the curve is, as already shown
in Section II,

$${}_1\sigma_y^2 = \frac{\sigma^2}{N}\,(1 + x^2).$$

To get a check on the degree of the function and at the same time a flatter curve
of $\sigma_y^2$ than that obtained from a uniform distribution we may choose something
between the two extreme cases and take for example $\tfrac14 N$ observations at each end
of the range and $\tfrac12 N$ uniformly distributed through the range.


Then $a = 1$ and, according to (59),

$${}_1\sigma_y^2 = \frac{\sigma^2}{N}\left(1 + \tfrac{3}{2}\,x^2\right)$$

with the maximum, at $x = \pm 1$,

$${}_1\sigma_y = \frac{\sigma}{\sqrt{N}}\sqrt{\tfrac{5}{2}} = \frac{\sigma}{\sqrt{N}}\cdot 1.581.$$

(3) For a function of the second degree we find, from (46), for $x = 0$,

$${}_2\sigma_y^2 = \frac{\sigma^2}{N}\cdot\frac{9\,(1+a)(1+5a)}{4\,(1+6a)}$$

and from (50), for $x = 1$,

$${}_2\sigma_y^2 = \frac{\sigma^2}{N}\cdot 3\,(1+a)\left\{\frac{1}{1+3a} + \frac{2}{1+6a}\right\}.$$

We want to make these equal and this requires

$$3\,(1+5a)(1+3a) = 4\,\{1+6a+2\,(1+3a)\}$$

or

$$15a^2 - 8a - 3 = 0.$$

This has only one positive root $a = .7873500$.
For this value $\dfrac{a}{2\,(1+a)}$, which is the ratio between the number of observations
at one end of the range and the total number of observations, is $.2202562$.
As ${}_2\sigma_y^2 = {}_1\sigma_y^2 + S_2$ we find, from (59) and (54),

$${}_2\sigma_y^2 = \frac{\sigma^2}{N}\left(1 + \frac{3(1+a)}{1+3a}\,x^2 + \frac{5\,[2+3(1+a)(x^2-1)]^2}{4\,(1+6a)}\right) \tag{60}$$

for $a = .7873500$ the curve is

$${}_2\sigma_y^2 = \frac{\sigma^2}{N}\,\{3.46837 - 6.27862\,x^2 + 6.27862\,x^4\},$$

which has minima at $x = \pm\sqrt{\tfrac12}$.

The extreme values in the range of observations are therefore

$${}_2\sigma_y = \frac{\sigma}{\sqrt{N}}\cdot 1.3779 \quad\text{for } x = \pm\sqrt{\tfrac12}, \qquad {}_2\sigma_y = \frac{\sigma}{\sqrt{N}}\cdot 1.8624 \quad\text{for } x = 0 \text{ and } x = \pm 1.$$
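The second-degree numbers can be reproduced from the quadratic for $a$ and the expanded curve. The sketch below takes the curve as ${}_1\sigma_y^2 + S_2$ with $S_2$ written as $\frac{5}{4(1+6a)}[2+3(1+a)(x^2-1)]^2$, which is the reading of (54) consistent with the printed coefficients; names are illustrative.

```python
import math

def second_degree_curve(a):
    """Coefficients (c0, c2, c4) of N * (2 sigma_y^2) / sigma^2 = c0 + c2 x^2 + c4 x^4."""
    k = 5.0 / (4.0 * (1 + 6 * a))            # weight of the squared bracket in S_2
    u, v = 2 - 3 * (1 + a), 3 * (1 + a)      # bracket written as u + v x^2
    c0 = 1 + k * u * u
    c2 = 3 * (1 + a) / (1 + 3 * a) + 2 * k * u * v
    c4 = k * v * v
    return c0, c2, c4

# Positive root of 15 a^2 - 8 a - 3 = 0
a = (8 + math.sqrt(8 * 8 + 4 * 15 * 3)) / (2 * 15)
c0, c2, c4 = second_degree_curve(a)

end_share = a / (2 * (1 + a))                # share of all observations placed at one end
x_min_sq = -c2 / (2 * c4)                    # x^2 at the interior minimum
sd_min = math.sqrt(c0 + c2 * x_min_sq + c4 * x_min_sq ** 2)
sd_end = math.sqrt(c0 + c2 + c4)             # value at x = +-1, in units of sigma / sqrt(N)

print(round(a, 7), round(end_share, 7))
print(round(c0, 5), round(c2, 5), round(c4, 5))
print(round(sd_min, 4), round(sd_end, 4))
```

Since $c_2 = -c_4$ at this value of $a$, the interior minimum falls at $x^2 = \tfrac12$ and the curve takes the same value at $x = 0$ and $x = \pm 1$.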

(4) For a function of the third degree we have, from (46), for $x = 0$,

$${}_3\sigma_y^2 = \frac{\sigma^2}{N}\cdot\frac{9\,(1+a)(1+5a)}{4\,(1+6a)}$$

and from (50), for $x = 1$,

$${}_3\sigma_y^2 = \frac{\sigma^2}{N}\,(1+a)\left\{\frac{6}{1+6a} + \frac{10}{1+10a}\right\} = \frac{\sigma^2}{N}\cdot\frac{8\,(1+a)(2+15a)}{(1+6a)(1+10a)} \tag{61}$$

Hence the condition that they are equal is

$$9\,(1+5a)(1+10a) = 32\,(2+15a)$$

or

$$90a^2 - 69a - 11 = 0,$$

with one positive root $a = .9021461$.
From (60) and (55) we find

$${}_3\sigma_y^2 = \frac{\sigma^2}{N}\left(1 + \frac{3(1+a)}{1+3a}\,x^2 + \frac{5\,[2+3(1+a)(x^2-1)]^2}{4\,(1+6a)} + \frac{7\,(1+a)\,x^2\,[2+5(1+3a)(x^2-1)]^2}{4\,(1+3a)(1+10a)}\right) \tag{62}$$

which for $a = .9021461$ becomes

$${}_3\sigma_y^2 = \frac{\sigma^2}{N}\,\{3.67775 + 17.78799\,x^2 - 48.56651\,x^4 + 30.77852\,x^6\}.$$

Besides the minimum for $x = 0$ this curve has other minima for $x^2 = .815820$
and maxima for $x^2 = .2361366$.

The maxima and minima are as follows:

    For x = 0 and x = ±1     σ_y = σ/√N × 1.9177,
     "  x = ±.48594          σ_y = σ/√N × 2.3612.
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By choosing a = -9021461, that is by taking -237139 x N observations at each 
end of the range, we seem therefore to have overshot our aim since the result is that 
we have got inside the range a maximum for o, greater than the value obtained for 
2=+1. 
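Since the curve is a polynomial in $u = x^2$, its interior maxima and minima are the roots of the derivative with respect to $u$. The following sketch locates them from the printed coefficients for $a = .9021461$; it is a check on the arithmetic rather than the author's own procedure, and the variable names are ours.

```python
import numpy as np

# Printed coefficients of N*sigma_y^2/sigma^2 in powers of u = x^2 (a = .9021461):
# constant, u, u^2, u^3
coef = [3.67775, 17.78799, -48.56651, 30.77852]

curve = np.polynomial.Polynomial(coef)

# Stationary points inside the range are the real roots of d(curve)/du in (0, 1).
stationary_u = sorted(
    float(r.real) for r in curve.deriv().roots()
    if abs(r.imag) < 1e-12 and 0.0 < r.real < 1.0
)

for u in [0.0] + stationary_u + [1.0]:
    print(f"x = {np.sqrt(u):.5f}   sigma_y*sqrt(N)/sigma = {np.sqrt(curve(u)):.4f}")
```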

(5) Our next attempt shall be to make

$$[{}_3\sigma_y^2]_{x=1} = 2\,[{}_3\sigma_y^2]_{x=0}.$$

It requires $9\,(1+5a)(1+10a) = 16\,(2+15a)$ or $450a^2 - 105a - 23 = 0$.

The only positive root is $a = .3710723$ which gives the curve

$${}_3\sigma_y^2 = \frac{\sigma^2}{N}\,\{2.730117 + 12.89741\,x^2 - 37.07612\,x^4 + 26.90882\,x^6\}.$$

The maxima and minima are:

For $x = .0000$, $\sigma_y = \frac{\sigma}{\sqrt N}\cdot 1.652$,
for $x = \pm .4828$, $\sigma_y = \frac{\sigma}{\sqrt N}\cdot 2.016$,
for $x = \pm .8279$, $\sigma_y = \frac{\sigma}{\sqrt N}\cdot 1.678$,
for $x = \pm 1.0000$, $\sigma_y = \frac{\sigma}{\sqrt N}\cdot 2.337$.

This distribution of observations makes $\sigma_y$ for $x = \pm 1$ greater than the maximum at $x = \pm .4828$. By interpolation between these two cases we shall now try to find an $a$, lying between those of our two trials, for which $\sigma_y$ for $x = \pm 1$ equals the maximum value of $\sigma_y$ which still may be expected at about $x = \pm .48$.


(6) In our first attempt we found $[{}_3\sigma_y]_{x=1} = \frac{\sigma}{\sqrt N}\cdot 1.918$ and its difference from the maximum $\frac{\sigma}{\sqrt N}\cdot .444$, in the second attempt $[{}_3\sigma_y]_{x=1} = \frac{\sigma}{\sqrt N}\cdot 2.337$ and its difference from the maximum $-\frac{\sigma}{\sqrt N}\cdot .321$.

If the relation were linear this difference would be zero for

$$[{}_3\sigma_y]_{x=1} = \frac{\sigma}{\sqrt N}\cdot 2.161.$$

The $a$ for which $[{}_3\sigma_y]_{x=1}$ takes this value is found by (61) which leads to

$$8\,(1+a)(2+15a) = 2.161^2\,(1+6a)(1+10a)$$

or $160.20\,a^2 - 61.28\,a - 11.330 = 0$,
with the positive root $a = .519$.
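The interpolation step can be retraced numerically: a straight line is drawn through the two trials, the end value at which the difference vanishes is read off, and (61) is then solved for $a$. The sketch below uses only the figures quoted in the text; the variable names are ours.

```python
import math

# (sigma_y at x = 1, interior maximum minus that value), in units of sigma/sqrt(N)
trial_1 = (1.918, 0.444)     # a = .9021461
trial_2 = (2.337, -0.321)    # a = .3710723

# Straight line through the two trials: end value at which the difference is zero.
(s1, d1), (s2, d2) = trial_1, trial_2
target = s1 + (s2 - s1) * d1 / (d1 - d2)

# Formula (61): 8(1+a)(2+15a) = target^2 (1+6a)(1+10a), written as p*a^2 + q*a + r = 0.
t2 = target ** 2
p, q, r = 60.0 * t2 - 120.0, 16.0 * t2 - 136.0, t2 - 16.0
a = (-q + math.sqrt(q * q - 4.0 * p * r)) / (2.0 * p)

print(f"target end value = {target:.3f}, a = {a:.3f}")
```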


For this value (62) becomes

$${}_3\sigma_y^2 = \frac{\sigma^2}{N}\,\{2.9866 + 14.2364\,x^2 - 40.0058\,x^4 + 27.4521\,x^6\}.$$

The maxima and minima are:

For $x = .0000$, $\sigma_y = \frac{\sigma}{\sqrt N}\cdot 1.728$,
for $x = \pm .4843$, $\sigma_y = \frac{\sigma}{\sqrt N}\cdot 2.116$,
for $x = \pm .8585$, $\sigma_y = \frac{\sigma}{\sqrt N}\cdot 1.655$,
for $x = \pm 1.0000$, $\sigma_y = \frac{\sigma}{\sqrt N}\cdot 2.161$,

and this distribution which has $.1708 \times N$ observations at each end of the range may be considered satisfactory.


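(7) From (46) and (50) we find, for a function of the fourth degree,

$$[{}_4\sigma_y^2]_{x=0} = \frac{\sigma^2}{N}\,\frac{225\,(1+a)(1+14a)}{64\,(1+15a)}$$

and

$$[{}_4\sigma_y^2]_{x=1} = \frac{\sigma^2}{N}\,5\,(1+a)\left\{\frac{2}{1+10a} + \frac{3}{1+15a}\right\},$$

which are equal when

$$9\,(1+14a)(1+10a) = 64\,(1+12a)$$

or $1260a^2 - 552a - 55 = 0$,
that is when $a = .5217564$.

The formula for ${}_4\sigma_y^2$, found from (62) and (56), is

$${}_4\sigma_y^2 = \frac{\sigma^2}{N}\Big\{1 + \frac{3\,(1+a)}{1+3a}\,x^2 + \frac{5\,[2 + 3\,(1+a)(x^2-1)]^2}{4\,(1+6a)} + \frac{7\,(1+a)\,x^2\,[2 + 5\,(1+3a)(x^2-1)]^2}{4\,(1+3a)(1+10a)}$$
$$\qquad + \frac{9\,(1+a)\,[8 + 20\,(2+9a)(x^2-1) + 35\,(1+6a)(x^2-1)^2]^2}{64\,(1+6a)(1+15a)}\Big\} \qquad (63).$$

For $a = .5217564$ it becomes

$${}_4\sigma_y^2 = \frac{\sigma^2}{N}\,\{5.03367 - 19.72772\,x^2 + 133.01711\,x^4 - 235.96817\,x^6 + 122.67868\,x^8\}.$$

The maxima and minima are as follows:

For $x = 0$ and $x = \pm 1$, $\sigma_y = \frac{\sigma}{\sqrt N}\cdot 2.244$,
for $x = \pm .3130$, $\sigma_y = \frac{\sigma}{\sqrt N}\cdot 2.041$,
for $x = \pm .6844$, $\sigma_y = \frac{\sigma}{\sqrt N}\cdot 2.575$,
for $x = \pm .9361$, $\sigma_y = \frac{\sigma}{\sqrt N}\cdot 1.856$.

We have again, as for the function of the third degree, brought $[{}_4\sigma_y]_{x=1}$ down below one of the maxima of ${}_4\sigma_y$, although since ${}_4\sigma_y$ has a maximum at $x = 0$ the demand that $[\sigma_y]_{x=1} = [\sigma_y]_{x=0}$ is not so exacting as for ${}_3\sigma_y$ which has a minimum at $x = 0$.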
(8) We shall next make $[{}_4\sigma_y^2]_{x=1} = 1.2671861\,[{}_4\sigma_y^2]_{x=0}$ *.
The condition obtained from (46) and (50) is

$$9 \times 1.2671861\,(1+10a)(1+14a) = 64\,(1+12a)$$

or $a^2 - .3095773\,a - .032940969 = 0$,
with the only positive root $a = .3933269$.
Introducing this value of $a$ in (63) we get

$${}_4\sigma_y^2 = \frac{\sigma^2}{N}\,\{4.61918 - 18.02388\,x^2 + 122.71833\,x^4 - 220.34099\,x^6 + 116.88072\,x^8\}.$$

The maxima and minima for this curve are:

At $x = 0$, $\sigma_y = \frac{\sigma}{\sqrt N}\cdot 2.149$,
at $x = \pm .3116$, $\sigma_y = \frac{\sigma}{\sqrt N}\cdot 1.958$,
at $x = \pm .6839$, $\sigma_y = \frac{\sigma}{\sqrt N}\cdot 2.467$,
at $x = \pm .9214$, $\sigma_y = \frac{\sigma}{\sqrt N}\cdot 1.913$,
at $x = \pm 1.0000$, $\sigma_y = \frac{\sigma}{\sqrt N}\cdot 2.419$.

We have thus for $a = .3933269$, that is by taking $.141147 \times N$ observations at each end of the range, succeeded in bringing $[{}_4\sigma_y]_{x=1}$ down to be approximately equal to the highest of the maxima of the curve, thus fulfilling our purpose.


(9) After our experiences in the cases of the functions of the third and fourth degree we cannot expect for a function of the fifth degree by making

$$[{}_5\sigma_y^2]_{x=1} = [{}_5\sigma_y^2]_{x=0}$$

to find a curve which has not a greater maximum than that value. We shall therefore start with the attempt

$$[{}_5\sigma_y^2]_{x=1} = 2\,[{}_5\sigma_y^2]_{x=0}.$$

The condition found from (46) and (50) is

$$25\,(1+14a)(1+21a) = 64\,(2+35a)$$

or $7350a^2 - 1365a - 103 = 0$,
with the only positive root $a = .2433100$.


* The ratio 1.2671861 results from consideration of a special $\sigma_y$ curve. It was determined as that curve obtained from three groups of observations for which the standard deviation of the $\sigma_y$'s within the range of observations was a minimum. It is not mentioned elsewhere in this memoir as it does not seem to have the interest I at first assumed it to have.
For ${}_5\sigma_y^2$ we find, from (63) and (57),

$${}_5\sigma_y^2 = \frac{\sigma^2}{N}\Big\{1 + \frac{3\,(1+a)}{1+3a}\,x^2 + \frac{5\,[2 + 3\,(1+a)(x^2-1)]^2}{4\,(1+6a)} + \frac{7\,(1+a)\,x^2\,[2 + 5\,(1+3a)(x^2-1)]^2}{4\,(1+3a)(1+10a)}$$
$$\qquad + \frac{9\,(1+a)\,[8 + 20\,(2+9a)(x^2-1) + 35\,(1+6a)(x^2-1)^2]^2}{64\,(1+6a)(1+15a)}$$
$$\qquad + \frac{11\,(1+a)\,x^2\,[8 + 28\,(2+15a)(x^2-1) + 63\,(1+10a)(x^2-1)^2]^2}{64\,(1+10a)(1+21a)}\Big\} \qquad (64).$$

Introducing $a = .2433100$ we get

$${}_5\sigma_y^2 = \frac{\sigma^2}{N}\,\{4.14228 + 28.47030\,x^2 - 258.05238\,x^4 + 853.0448\,x^6 - 1095.9212\,x^8 + 476.5990\,x^{10}\},$$

from which we find the maxima and minima:

At $x = 0$, $\sigma_y = \frac{\sigma}{\sqrt N}\cdot 2.035$,
at $x = \pm .2958$, $\sigma_y = \frac{\sigma}{\sqrt N}\cdot 2.273$,
at $x = \pm .5004$, $\sigma_y = \frac{\sigma}{\sqrt N}\cdot 2.155$,
at $x = \pm .7853$, $\sigma_y = \frac{\sigma}{\sqrt N}\cdot 2.762$,
at $x = \pm .9418$, $\sigma_y = \frac{\sigma}{\sqrt N}\cdot 2.231$,
at $x = \pm 1.0000$, $\sigma_y = \frac{\sigma}{\sqrt N}\cdot 2.878$.

$[{}_5\sigma_y]_{x=1}$ does not differ much from the greatest maximum and we may thus consider the distribution with $.097848 \times N$ observations at each end of the range, for which $a = .2433100$, as satisfying fairly well our aim.


(10) Considering our previous results we must assume that for a function of the sixth degree $[{}_6\sigma_y^2]_{x=1} / [{}_6\sigma_y^2]_{x=0}$ ought to be made somewhat smaller than 2 which was the value that gave a satisfying result for a function of the fifth degree.

Let us assume $[{}_6\sigma_y^2]_{x=1} = 1.75\,[{}_6\sigma_y^2]_{x=0}$, or, substituting from (46) and (50),

$$256\,(1+24a) = 1.75 \times 25\,(1+21a)(1+27a)$$

from which $567a^2 - 92.43430\,a - 4.851429 = 0$
and $a = .2048019$
are found.
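For ${}_6\sigma_y^2$ we get, from (64) and (58),

$${}_6\sigma_y^2 = \frac{\sigma^2}{N}\Big\{1 + \frac{3\,(1+a)}{1+3a}\,x^2 + \frac{5\,[2 + 3\,(1+a)(x^2-1)]^2}{4\,(1+6a)} + \frac{7\,(1+a)\,x^2\,[2 + 5\,(1+3a)(x^2-1)]^2}{4\,(1+3a)(1+10a)}$$
$$\qquad + \frac{9\,(1+a)\,[8 + 20\,(2+9a)(x^2-1) + 35\,(1+6a)(x^2-1)^2]^2}{64\,(1+6a)(1+15a)}$$
$$\qquad + \frac{11\,(1+a)\,x^2\,[8 + 28\,(2+15a)(x^2-1) + 63\,(1+10a)(x^2-1)^2]^2}{64\,(1+10a)(1+21a)}$$
$$\qquad + \frac{13\,(1+a)\,[16 + 168\,(1+10a)(x^2-1) + 126\,(3+40a)(x^2-1)^2 + 231\,(1+15a)(x^2-1)^3]^2}{256\,(1+15a)(1+28a)}\Big\},$$

which for $a = .2048019$ becomes

$${}_6\sigma_y^2 = \frac{\sigma^2}{N}\,\{5.58984 - 33.14234\,x^2 + 504.4523\,x^4 - 2512.673\,x^6 + 5524.186\,x^8 - 5452.650\,x^{10} + 1974.020\,x^{12}\}.$$

The maxima and minima are:

At $x = 0$, $\sigma_y = \frac{\sigma}{\sqrt N}\cdot 2.364$,
at $x = \pm .2216$, $\sigma_y = \frac{\sigma}{\sqrt N}\cdot 2.216$,
at $x = \pm .4826$, $\sigma_y = \frac{\sigma}{\sqrt N}\cdot 2.515$,
at $x = \pm .6194$, $\sigma_y = \frac{\sigma}{\sqrt N}\cdot 2.427$,
at $x = \pm .8445$, $\sigma_y = \frac{\sigma}{\sqrt N}\cdot 3.149$,
at $x = \pm .9615$, $\sigma_y = \frac{\sigma}{\sqrt N}\cdot 2.485$,
at $x = \pm 1.0000$, $\sigma_y = \frac{\sigma}{\sqrt N}\cdot 3.128$.

It thus appears that this distribution which has $.08499 \times N$ observations at each end of the range fulfils our demand that $[{}_6\sigma_y]_{x=1}$ shall be approximately equal to the greatest of the maxima.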


(11) We bring together our final results in the following table. It gives the distribution of observations, the maximum of $\sigma_y$ within the range, the value of $\sqrt{n+1}$ or the lowest maximum of $\sigma_y\sqrt N/\sigma$ possible, which can only be obtained by distributing the observations of the function of the $n$th degree into $(n+1)$ groups, and the value of $n+1$ which is the maximum of $\sigma_y\sqrt N/\sigma$ for a uniform distribution.

TABLE II.

  Degree of    Ratio of number of observations    Maximum of
  function     at each end of the range to        sigma_y sqrt(N)/sigma    sqrt(n+1)    n+1
               the total number

     1              .2500                              1.581                1.414        2
     2              .2203                              1.862                1.732        3
     3              .1708                              2.161                2.000        4
     4              .1411                              2.467                2.236        5
     5              .0978                              2.878                2.449        6
     6              .0850                              3.149                2.646        7


A comparison between our maximum and $\sqrt{n+1}$ shows the price we have to pay for information about the degree of the function. For lower degrees the maximum only differs quite insignificantly from $\sqrt{n+1}$, but with increasing degree the difference grows relatively greater, for the sixth degree being about one-fifth of $\sqrt{n+1}$.
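The maxima collected in Table II can be recomputed without the closed formulae, directly from the moments of the distribution: for a polynomial of degree $n$ the quantity $N\sigma_y^2/\sigma^2$ equals $v(x)^{T} M^{-1} v(x)$ with $v(x) = (1, x, \dots, x^n)$ and $M$ the matrix of moments $\mu_{i+j}$. The sketch below is such a recomputation, not the method of the memoir; it assumes, as the ratios $a/2(1+a)$ in the text imply, that a share $1/(1+a)$ of the observations is uniform on the range and a share $a/(1+a)$ is split equally between the two ends. Function names are ours.

```python
import numpy as np

def moment(k, a):
    """k-th moment: share 1/(1+a) uniform on [-1, 1], share a/(1+a) split between x = -1 and +1."""
    if k % 2:
        return 0.0
    return (1.0 / (k + 1) + a) / (1.0 + a)

def scaled_sd(x, degree, a):
    """sigma_y * sqrt(N) / sigma for a fitted polynomial of the given degree."""
    m = np.array([[moment(i + j, a) for j in range(degree + 1)]
                  for i in range(degree + 1)])
    powers = np.vander(np.atleast_1d(x), degree + 1, increasing=True)
    # v(x)^T M^{-1} v(x), evaluated for every x at once
    return np.sqrt(np.einsum("ij,ij->i", powers, np.linalg.solve(m, powers.T).T))

# Values of a arrived at in the text for degrees 1 to 6.
a_values = {1: 1.0, 2: 0.78735, 3: 0.519, 4: 0.3933269, 5: 0.24331, 6: 0.2048019}
grid = np.linspace(0.0, 1.0, 4001)          # the curves are symmetrical about x = 0
table = {n: float(scaled_sd(grid, n, a).max()) for n, a in a_values.items()}

for n, peak in table.items():
    a = a_values[n]
    print(n, round(a / (2 * (1 + a)), 4), round(peak, 3), round(float(np.sqrt(n + 1)), 3), n + 1)
```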


The curves of standard deviation for the three sets of distributions are given in Diagrams 3—8, while Diagram 9 represents the six curves just reached.
It seems likely from the form of the $\sigma_y$ curves that two clusters of observations placed at the outermost of the maxima besides the two clusters at the ends of the range would produce a $\sigma_y$ curve with a lower maximum than the one we have succeeded in getting for the functions from the fourth to the sixth degree. But then again the position of these new clusters would depend on the degree of the function and thus make the proceedings more complicated; and what is more, at the same time as the maximum of the curve approached $\sqrt{n+1}$ the distribution of observations would incur the disadvantages of the grouping in $(n+1)$ clusters. On the whole the distribution arrived at seems to be satisfactory and certainly marks a great progress from the uniform distribution.


VII. Observations with varying standard deviation.

(1) In Section I we have already given the formula for the standard deviation $\sigma_y$ of an adjusted $y$ when the standard deviation $s_x$ of an observation is $\sigma\sqrt{f(x)}$. It is expressed in terms of the quantities

$$m_p = \int \frac{x^p\,\psi(x)}{f(x)}\,dx,$$

$\psi(x)\,dx$ being the number of observations between $x$ and $x + dx$ and the integration being extended over the range of observations.

It is clear that if we have found a suitable curve of squared standard deviation for adjusted $y$ by taking a distribution $\phi(x)$ of observations with constant standard deviations a corresponding curve can be derived for observations with varying standard deviations by using the distribution

$$\psi(x) = k\,\phi(x)\,f(x) \qquad (65).$$

As $\int k\,\phi(x)\,f(x)\,dx = N$ the constant $k$ must be

$$k = \frac{N}{\int \phi(x)\,f(x)\,dx}.$$

Hence we find

$$m_p = k\int x^p\,\phi(x)\,dx = k\,N\,\mu_p,$$

where $\mu_p$ is the $p$th moment coefficient for the distribution $\phi(x)$, and as this holds for any $p$ the determinant may be written

$$\sigma_y^2 = -\frac{\sigma^2}{kN}\;
\frac{\begin{vmatrix}
0 & 1 & x & \cdots & x^n\\
1 & 1 & \mu_1 & \cdots & \mu_n\\
x & \mu_1 & \mu_2 & \cdots & \mu_{n+1}\\
\vdots & \vdots & \vdots & & \vdots\\
x^n & \mu_n & \mu_{n+1} & \cdots & \mu_{2n}
\end{vmatrix}}
{\begin{vmatrix}
1 & \mu_1 & \cdots & \mu_n\\
\mu_1 & \mu_2 & \cdots & \mu_{n+1}\\
\vdots & \vdots & & \vdots\\
\mu_n & \mu_{n+1} & \cdots & \mu_{2n}
\end{vmatrix}} \qquad (66).$$

We thus find the same determinant as the distribution $\phi(x)$ would give for observations with constant error of observation except that the factor $k$ has come in, that is to say the expression for $\sigma_y^2$ has been multiplied by

$$\frac{1}{k} = \frac{1}{N}\int \phi(x)\,f(x)\,dx \qquad (67).$$

The goodness of the distribution therefore will partly depend on the value of $\frac1k$; and because we have found $\phi(x)$ the best distribution for observations with constant standard deviation it does not follow that

$$\psi(x) = k\,\phi(x)\,f(x)$$

is the best distribution for observations with the standard deviation $\sigma\sqrt{f(x)}$.

But the deriving of $\psi(x)$ from $\phi(x)$ is nevertheless useful as a means of simplifying the investigations and will be applied in the following special inquiries.

We shall consider two forms of $f(x)$ and try to find the best distributions for functions of the first and of the second degree.
(a) $f(x) = (1 + ax^2)^2$, where $a > -1$,
for errors of observation increasing or decreasing in both directions from the middle of the range.
(b) $f(x) = (1 + ax)^2$, where $1 > a \geq 0$,
for error of observations increasing in one direction.
These two forms will roughly cover two distinct and important types of cases, such as occur in practice.
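(2) When $f(x) = (1 + ax^2)^2$ we find, according to (67),

$$\frac1k = 1 + 2a\mu_2 + a^2\mu_4,$$

and as (66) for $n = 1$ gives

$$\sigma_y^2 = \frac{\sigma^2}{N}\cdot\frac1k\cdot\frac{\mu_2 - 2\mu_1 x + x^2}{\mu_2 - \mu_1^2},$$

we have for a function of the first degree

$$\sigma_y^2 = \frac{\sigma^2}{N}\,(1 + 2a\mu_2 + a^2\mu_4)\,\frac{\mu_2 - 2\mu_1 x + x^2}{\mu_2 - \mu_1^2} \qquad (68).$$

This curve has a minimum for $x = \mu_1$ and the maximum in the range is, if $\mu_1 > 0$, at $x = -1$, and if $\mu_1 < 0$, at $x = 1$; it equals in both cases

$$\frac{\sigma^2}{N}\,(1 + 2a\mu_2 + a^2\mu_4)\left\{1 + \frac{(1 + |\mu_1|)^2}{\mu_2 - \mu_1^2}\right\} \qquad (69),$$

$|\mu_1|$ being the numerical value of $\mu_1$.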


Now (69) is a minimum for $\mu_1 = 0$; we therefore ought to choose that value for $\mu_1$ and we then get, from (68),

$$\sigma_y^2 = \frac{\sigma^2}{N}\,(1 + 2a\mu_2 + a^2\mu_4)\left(1 + \frac{x^2}{\mu_2}\right) \qquad (70).$$

$\mu_2$ and $\mu_4$ may vary between 0 and 1 independently of each other and are only bound by the conditions that $\mu_4 \leq \mu_2$ and $\mu_4 \geq \mu_2^2$.

For any set of values which satisfies these conditions we may determine a distribution consisting of $\tfrac12 yN$ observations at $x = \pm v$ and $(1-y)N$ at $x = 0$, since from any two such values we could determine

$$v^2 = \frac{\mu_4}{\mu_2} \quad\text{and}\quad y = \frac{\mu_2^2}{\mu_4}.$$

By introducing $v^2$ and $y$ for $\mu_2$ and $\mu_4$ we get two quite independent variables and (70) then takes the form

$$\sigma_y^2 = \frac{\sigma^2}{N}\,(1 + 2ayv^2 + a^2yv^4)\left(1 + \frac{x^2}{yv^2}\right).$$
We now have to determine $y$ and $v^2$ so that the maximum value $[\sigma_y^2]_{x=1}$ is as small as possible. We find

$$\frac{d}{dy}[\sigma_y^2]_{x=1} = \frac{\sigma^2}{N}\left(2av^2 + a^2v^4 - \frac{1}{y^2v^2}\right) \qquad (71),$$

$$\frac{d}{dv^2}[\sigma_y^2]_{x=1} = \frac{\sigma^2}{N}\left(2ay + 2a^2yv^2 + a^2 - \frac{1}{yv^4}\right) \qquad (72).$$

Clearly $\frac{d}{dy}[\sigma_y^2]_{x=1} = 0$ leads to

$$y^2 = \frac{1}{av^4\,(2 + av^2)} \qquad (73).$$

Introducing this value into (72) we obtain

$$\frac{d}{dv^2}[\sigma_y^2]_{x=1} = \frac{\sigma^2}{N}\left\{a^2 + \frac{a^2}{\sqrt{a\,(2 + av^2)}}\right\},$$

which is $> 0$.

Hence the minimum for constant $v^2$ determined by $\frac{d}{dy}[\sigma_y^2]_{x=1} = 0$ decreases with $v^2$.

But when $v^2$ decreases, $y^2$, as given by (73), increases and the lowest value of $v^2$, for which it is real, is that determined by

$$y^2 = \frac{1}{av^4\,(2 + av^2)} = 1.$$

For $v^2$ smaller than this (73) gives $y^2 > 1$, and as long as $y^2 \leq 1$ we therefore have

$$\frac{d}{dy}[\sigma_y^2]_{x=1} < 0.$$

Hence the minimum of $[\sigma_y^2]_{x=1}$ is to be found for $y^2 = 1$.

For this value (72) may be written as

$$\frac{d}{dv^2}[\sigma_y^2]_{x=1} = \frac{\sigma^2}{N}\cdot\frac{1}{v^4}\,(1 + av^2)(2av^4 + av^2 - 1) \qquad (74),$$

which is zero for

$$v^2 = -\tfrac14 + \sqrt{\tfrac1{16} + \tfrac1{2a}} \qquad (75)$$

and $> 0$ for $v^2$ greater than this value.

When the $v^2$ found lies between 0 and 1, that is when $a > \tfrac13$, we have thus found the minimum sought. When $a \leq \tfrac13$, then $\frac{d}{dv^2}[\sigma_y^2]_{x=1}$ as given by (74) is $< 0$ and the minimum of $[\sigma_y^2]_{x=1}$ is found by giving $v^2$ its maximum value, that is 1.
Returning to the variates $\mu_2$ and $\mu_4$ we see that in all cases

$$\mu_4 = \mu_2^2,$$
from which it follows that no distribution of observations other than those arrived at consisting of two equally big groups can give $\mu_1$, $\mu_2$ and $\mu_4$ the values required.

We accordingly reach the result that: when observing a function of the first degree for which the standard deviation of the observations is $\sigma(1 + ax^2)$, symmetrical about the middle of the range, we get the best function for $\sigma_y$ by taking two equally big groups of observations, at the ends of the range if $a \leq \tfrac13$ and at $x = \pm\sqrt{-\tfrac14 + \sqrt{\tfrac1{16} + \tfrac1{2a}}}$ if $a > \tfrac13$.
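The result just stated is easy to evaluate. The sketch below computes the position $v$ of the two groups from (75), kept at the end of the range when $a \leq \tfrac13$, and the corresponding maximum $(1 + av^2)\sqrt{1 + 1/v^2}$ of $\sigma_y\sqrt N/\sigma$ at $x = 1$; the function names are ours, and the values of $a$ are merely examples.

```python
import math

def best_v_squared(a):
    """Formula (75); the groups stay at the ends of the range (v^2 = 1) when a <= 1/3."""
    if a <= 1.0 / 3.0:
        return 1.0
    return -0.25 + math.sqrt(1.0 / 16.0 + 1.0 / (2.0 * a))

def max_scaled_sd(a):
    """sigma_y*sqrt(N)/sigma at x = 1 for two equal groups at x = -v and x = +v."""
    v2 = best_v_squared(a)
    return (1.0 + a * v2) * math.sqrt(1.0 + 1.0 / v2)

for a in (0.0, 1 / 6, 1 / 3, 0.5, 1.0, 2.0, 4.0):
    print(f"a = {a:.4f}   v = {math.sqrt(best_v_squared(a)):.4f}   maximum = {max_scaled_sd(a):.3f}")
```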


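(3) According to (70) the maximum of $\sigma_y^2$ for this distribution is

$$[\sigma_y^2]_{x=1} = \frac{\sigma^2}{N}\,(1 + av^2)^2\left(1 + \frac{1}{v^2}\right),$$

$v$ being equal to 1 for $a \leq \tfrac13$ and $v$ being determined by (75) for $a > \tfrac13$.
We shall next consider the distributions (i) for which $\phi(x)$ is constant from $-1$ to 1 and (ii) for which $\phi(x)$ consists of $\tfrac12 N$ observations uniformly distributed from $-1$ to 1 and $\tfrac12 N$ divided into two clusters.

(i) For a uniform distribution from $-1$ to 1 we have $\mu_2 = \tfrac13$, $\mu_4 = \tfrac15$ and, according to (67),

$$\frac1k = 1 + \tfrac23 a + \tfrac15 a^2;$$

the actual distribution is hence, as $\phi(x) = \frac N2$,

$$\psi(x) = \frac N2\cdot\frac{(1 + ax^2)^2}{1 + \tfrac23 a + \tfrac15 a^2},$$

and the maximum $\sigma_y^2$, as given by (70) for $x = \pm 1$,

$$[\sigma_y^2]_{x=\pm1} = \frac{\sigma^2}{N}\,(1 + \tfrac23 a + \tfrac15 a^2)\cdot 4.$$

(ii) When $\phi(x) = \frac N4$ with the additional clusters $\frac N4$ at $\pm u$ we have

$$\mu_2 = \tfrac16 + \tfrac12 u^2 \quad\text{and}\quad \mu_4 = \tfrac1{10} + \tfrac12 u^4.$$

According to (70) the maximum $\sigma_y^2$ is then

$$[\sigma_y^2]_{x=\pm1} = \frac{\sigma^2}{N}\left\{1 + a\,(\tfrac13 + u^2) + a^2\,(\tfrac1{10} + \tfrac12 u^4)\right\}\left(1 + \frac{6}{1 + 3u^2}\right).$$

We shall now determine $u$ so as to make this a minimum. We find that

$$\frac{d}{du^2}[\sigma_y^2]_{x=\pm1} = 0$$

requires

$$45a^2u^6 + 15a\,(3 + 5a)\,u^4 + 5a\,(6 + 7a)\,u^2 - (90 - 5a + 9a^2) = 0 \qquad (76),$$

the root $u^2$ of which is $> 1$ for $a < .5576$.

For $a \leq .5576$ we hence get the minimum by taking the clusters at $u = \pm 1$ and for $a > .5576$ at the places $\pm u$ determined by (76).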
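Equation (76) is a cubic in $u^2$ with a single positive root, so the cluster position and the resulting maximum can be obtained with a general root finder. The sketch below does this with `numpy.roots`; it is an illustrative check, and the function names are ours.

```python
import numpy as np

def cluster_position_squared(a):
    """u^2 from equation (76); the clusters stay at the ends (u^2 = 1) when the root exceeds 1."""
    if a == 0.0:
        return 1.0
    coeffs = [45 * a**2, 15 * a * (3 + 5 * a), 5 * a * (6 + 7 * a), -(90 - 5 * a + 9 * a**2)]
    # All coefficients but the last are positive, so there is exactly one positive real root.
    positive = [float(r.real) for r in np.roots(coeffs) if abs(r.imag) < 1e-9 and r.real > 0]
    return min(1.0, positive[0])

def max_scaled_sd(a):
    """sigma_y*sqrt(N)/sigma at x = +-1 for phi = N/4 with clusters N/4 at +-u."""
    w = cluster_position_squared(a)
    one_over_k = 1 + a * (1 / 3 + w) + a**2 * (1 / 10 + w**2 / 2)   # 1 + 2a*mu2 + a^2*mu4
    return float(np.sqrt(one_over_k * (1 + 6 / (1 + 3 * w))))

for a in (0.0, 0.5, 1.0, 2.0, 4.0):
    print(f"a = {a:.1f}   u = {np.sqrt(cluster_position_squared(a)):.4f}   maximum = {max_scaled_sd(a):.3f}")
```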
Table III contains for a series of values of $a$ the values of $v$, $(1 + av^2)$ and $u$ of the two distributions above and the maximum $\sigma_y$ for the three distributions.

TABLE III.

(The three columns of maxima give the maximum of $\sigma_y\sqrt N/\sigma$.)

                                 Maximum     Maximum for                Maximum for the distribution
   a       v       1 + av^2     for best    distribution      u        for which phi(x) = N/4
                                distri-     with                       and clusters of N/4
                                bution      phi(x) = N/2               at +-u

   0     1.0000     1.000        1.414        2.000         1.0000        1.581
  1/6    1.0000     1.167        1.650        2.113         1.0000        1.760
  1/3    1.0000     1.333        1.886        2.231         1.0000        1.944
  1/2     .8836     1.390        2.100        2.352         1.0000        2.131
  2/3     .8071     1.434        2.284        2.477          .9289        2.316
  5/6     .7510     1.470        2.448        2.603          .8502        2.483
   1      .7071     1.500        2.598        2.733          .7798        2.637
   2      .5559     1.618        3.330        3.540          .5762        3.438
   3      .4782     1.686        3.908        4.382          .4925        4.173
   4      .4278     1.732        4.404        5.241          .4612        4.899

The difference between the maxima from the two first distributions taken as a 
proportion of the maximum of the first decreases from 41 per cent. at a = 0 to the 
minimum 5 per cent. at a = 1, and then again increases to 19 per cent. at a = 4. For
small a, that is in practice a = 0, and again for a > 3, for which the difference is 
greater than 12 per cent., the third distribution may therefore be useful as giving 
a much smaller maximum value than the purely continuous distribution and at 
the same time offering some justification for the form of the function. 


(4) We shall next, still assuming that $f(x) = (1 + ax^2)^2$, consider the choice of observations for a function of the second degree.
According to (66) and (67) we find

$$\sigma_y^2 = \frac{\sigma^2}{N}\cdot\frac1k\left\{\frac{\mu_2\mu_4 - \mu_3^2 + 2\,(\mu_2\mu_3 - \mu_1\mu_4)\,x + (\mu_4 - 3\mu_2^2 + 2\mu_1\mu_3)\,x^2 + 2\,(\mu_1\mu_2 - \mu_3)\,x^3 + (\mu_2 - \mu_1^2)\,x^4}{\mu_2\mu_4 - \mu_3^2 + 2\mu_1\mu_2\mu_3 - \mu_1^2\mu_4 - \mu_2^3}\right\} \qquad (77),$$

and

$$\frac1k = 1 + 2a\mu_2 + a^2\mu_4,$$

where the $\mu$'s are the moment coefficients about $x = 0$ of the distribution $\phi(x)$ which is connected with the actual distribution $\psi(x)$ by the relation

$$\psi(x) = k\,\phi(x)\,f(x).$$

From any distribution $\phi(x)$ which has $\mu_1$ and $\mu_3 \neq 0$ we can form a symmetrical $\tfrac12\{\phi(x) + \phi(-x)\}$ which has the same $\mu_2$ and $\mu_4$ as $\phi(x)$. We shall prove that the maximum $\sigma_y^2$ obtained from the symmetrical distribution is always lower than that obtained from the skew.
Let the factor in curled brackets in (77) be $F_s$ for a skew distribution $\phi(x)$ and $F_0$ for the corresponding symmetrical distribution.

We then have

$$F_0 = \frac{\mu_2\mu_4 + (\mu_4 - 3\mu_2^2)\,x^2 + \mu_2 x^4}{\mu_2\,(\mu_4 - \mu_2^2)}.$$

The condition for a maximum or minimum other than that at $x = 0$ is

$$3\mu_2^2 - \mu_4 > 0,$$

or $\mu_4 < 3\mu_2^2$, and as the denominator is positive we have in that case the maximum at $x = 0$. It is thus clear that the maxima of $F_0$ between $-1$ and 1 must be either at $x = 0$ or at $x = \pm 1$.

We shall show that

$$[F_s]_{x=0} > [F_0]_{x=0},$$

and that either $[F_s]_{x=1}$ or $[F_s]_{x=-1} > [F_0]_{x=\pm1}$.

According to what has been proved in Section I (4) the coefficient of $x^4$ in (77) is positive, the denominator of (77) is therefore positive and we have

$$[F_s - F_0]_{x=0} = \frac{(\mu_2\mu_3 - \mu_1\mu_4)^2}{(\mu_4 - \mu_2^2)(\mu_2\mu_4 - \mu_3^2 + 2\mu_1\mu_2\mu_3 - \mu_1^2\mu_4 - \mu_2^3)}.$$

We shall next compare $F_s$ and $F_0$ for $x = \pm 1$.

Putting $[F_0]_{x=1} = \frac ND$ we have

$$[F_s]_{x=\pm1} = \frac{N - \delta}{D - \epsilon},$$

where $\delta = \mu_3^2 - 2\mu_1\mu_3 + \mu_1^2 \pm 2\,\{\mu_3\,(1 - \mu_2) - \mu_1\,(\mu_2 - \mu_4)\}$
and $\epsilon = \mu_3^2 - 2\mu_1\mu_2\mu_3 + \mu_1^2\mu_4$.
For $\frac\delta\epsilon$ we find

$$\frac\delta\epsilon = \frac{(\mu_3 - \mu_1)^2 \pm 2\,\{\mu_3\,(1 - \mu_2) - \mu_1\,(\mu_2 - \mu_4)\}}{(\mu_3 - \mu_1\mu_2)^2 + \mu_1^2\,(\mu_4 - \mu_2^2)}.$$
Looking first at the case “ = 0, we have 
M3 


(Hs — My)? < (Mg — py fs)”, 
and if we choose the value for which the other term of the numerator is < 0, 


) 
ar ls 
€ 
When = < 0 we see, from considering the form 
3 
5 = oe pi (1 = pg) — 2pty ps (1 — fog) + 2 {Hs (1 = bea) — Hy (Me — Pads 
€ (es — Habe)” + pei (Ha — p13) 


that for either x = 1 or z= — 1 


oS 
€ 
As $\epsilon > 0$ we have hence for any $\mu_1$ and $\mu_3$, remembering that $\frac ND$ being a squared standard deviation multiplied by the number of observations is $\geq 1$,

$$\frac{N - \delta}{D - \epsilon} > \frac ND,$$

that is, for either $x = 1$ or $-1$,

$$F_s > F_0.$$

We have thus proved that the maxima of $F_0$ are below those of $F_s$.


(5) Our problem is hence reduced to finding the best curve among those represented by

$$\sigma_y^2 = \frac{\sigma^2}{N}\cdot\frac{1 + 2a\mu_2 + a^2\mu_4}{\mu_2\,(\mu_4 - \mu_2^2)}\,\{\mu_2\mu_4 + (\mu_4 - 3\mu_2^2)\,x^2 + \mu_2 x^4\} \qquad (78).$$

As was stated in (2) of this section we get all sets of possible values for $\mu_2$ and $\mu_4$ from three groups of observations symmetrical about $x = 0$, and we may therefore limit our search of the best distribution to these.

Let the observations be $\tfrac12 yN$ at $x = \pm v$, and $(1-y)N$ at $x = 0$. The interpolation formula of Lagrange gives, when $\bar y_p$ represents the mean of the observations at $x = p$,

$$y = \frac{v^2 - x^2}{v^2}\,\bar y_0 + \frac{x\,(x+v)}{2v^2}\,\bar y_v + \frac{x\,(x-v)}{2v^2}\,\bar y_{-v},$$

from which we find

$$\sigma_y^2 = \frac{\sigma^2}{N}\cdot\frac{1}{v^4}\left\{\frac{(x^2 - v^2)^2}{1 - y} + \frac{x^2\,(x^2 + v^2)(1 + av^2)^2}{y}\right\} \qquad (79).$$

It is obvious that if for a certain distribution we have

$$[\sigma_y^2]_{x=0} > [\sigma_y^2]_{x=1}$$

we can get a better distribution by taking more observations at 0. If on the other hand

$$[\sigma_y^2]_{x=0} < [\sigma_y^2]_{x=1}$$

the curve cannot be the best unless $[\sigma_y^2]_{x=1}$ is a minimum for the present values of $v$ and $y$. From (79) we find
do. ope il \ —wvy? (1+7)(1+ oom (80) 
(@] oe awe ee 
1 do’, ao 1 As) (se v2—avt)(1+ a 
ee Eile ~ Nv eae y 
from which we obtain the conditions for maximum or minimum 
»_ 1+2Va 
y= 
a 
y 2Va(1 +Va)? 
and f= N=" 
ey ee OLV on Tl 
The lower sign requires $3 - 2\sqrt2 \leq a \leq \tfrac14$ and the upper sign $a > 3 + 2\sqrt2$ to make $0 \leq v^2 \leq 1$. The case $a < \tfrac13$ has no interest, as we have seen that when $a < \tfrac13$ extrapolation is not even for a linear function advantageous. We have therefore seen that for $a < 3 + 2\sqrt2$ * $[\sigma_y^2]_{x=1}$ has no minimum and we have thus proved that the best distribution requires $[\sigma_y^2]_{x=0} = [\sigma_y^2]_{x=1}$, that is

$$\frac{2v^2 - 1}{1 - y} = \frac{(1 + v^2)(1 + av^2)^2}{y}$$

or

$$\frac{1}{1 - y} = 1 + \frac{(1 + v^2)(1 + av^2)^2}{2v^2 - 1} \qquad (81).$$

The maximum of the curve is

$$[\sigma_y^2]_{x=0} = \frac{\sigma^2}{N}\cdot\frac{1}{1 - y}.$$

To find the minimum of this value we differentiate (81) and get

$$\frac{d}{dv^2}\left(\frac{1}{1 - y}\right) = \frac{(1 + av^2)(4av^4 - av^2 - 3 - 2a)}{(2v^2 - 1)^2},$$

which is zero for

$$v^2 = \tfrac18\left(1 + \sqrt{33 + \frac{48}{a}}\right) \qquad (82)$$

and positive for greater $v^2$, so that we have found a minimum.

For $a = 3$ we find from (82) $v^2 = 1$, hence for $a \leq 3$ we have to choose $v^2 = 1$, from which, according to (81), follows

$$\frac{1}{1 - y} = 1 + 2\,(1 + a)^2 \quad\text{or}\quad 1 - y = \frac{1}{1 + 2\,(1 + a)^2}.$$


When 34+2V2>a>3, eas (1 ab / 38 ae =) is < 1, and for the corresponding 
Qa 


ee 


y we have 
ts aay 
(lL+av)? 1+ —5a+4+Va(38a+ 48) (83) 
ty, a.) (Qu? — 1) 8 (a + 2) sewer eta eoee . 


Returning to the $\phi(x)$ distribution, which is found from this distribution by dividing the frequencies by $k\,(1 + ax^2)^2$, we therefore find, when $\tfrac12\epsilon N$ is the number of observations at $x = \pm v$ and $(1 - \epsilon)N$ that at $x = 0$,

$$\frac{\epsilon}{1 - \epsilon} = \frac{1 + v^2}{2v^2 - 1}$$

or

$$\epsilon = \frac{1 + v^2}{3v^2}.$$

* A further examination shows that for $a > 3 + 2\sqrt2$ $[\sigma_y^2]_{x=1}$ has a minimum but this is smaller than $[\sigma_y^2]_{x=0}$ when $a < 6.7$. Up to this value we therefore have $[\sigma_y^2]_{x=0} = [\sigma_y^2]_{x=1}$ for the best curves. For $a > 6.7$ the minimum of $[\sigma_y^2]_{x=1}$ determines the best distribution.

Hence

$$\mu_2 = \tfrac13\,(1 + v^2) \quad\text{and}\quad \mu_4 = \tfrac13\,v^2\,(1 + v^2) \qquad (84).$$

For $a \leq 3$ we have found $v^2 = 1$ which according to (84) involves $\mu_2 = \mu_4$, so that only the distribution above consisting of three groups can realise the requisite conditions.

When $a > 3$ we have $v < 1$ and therefore $\mu_4 < \mu_2$, so that it must be possible to satisfy the equation (84) by a continuous distribution of observations. However $v^2$ is decreasing so slowly for increasing $a$ that practically the distribution determined by (84) cannot differ much from three groups of observations.

Our results are accordingly that for a function of the second degree, of which the standard deviation of the observations is $\sigma(1 + ax^2)$, we get the best function for $\sigma_y$ when $a \leq 3$ by taking three groups of observations at the middle and the ends of the range, each group proportional to the squared standard deviation at the place, and when $3 + 2\sqrt2 > a > 3$ by taking three groups of observations determined by (82) and (83).


(6) From (78) we find

$$[\sigma_y^2]_{x=0} = \frac{\sigma^2}{N}\,(1 + 2a\mu_2 + a^2\mu_4)\,\frac{\mu_4}{\mu_4 - \mu_2^2},$$

which, when $\mu_2$ and $\mu_4$ are found in accordance with (82) and (84), determines the maximum $\sigma_y$ arrived at from our special three groups of observations. Besides the numerical evaluation of this standard deviation, we give in Table IV below the maximum of $\sigma_y$ obtained from a distribution for which $\phi(x)$ is constant from $-1$ to 1, that is, since, according to (67),

$$\frac1k = 1 + \tfrac23 a + \tfrac15 a^2,$$

the distribution

$$\psi(x) = \frac N2\cdot\frac{(1 + ax^2)^2}{1 + \tfrac23 a + \tfrac15 a^2}.$$

That maximum is determined by

$$\sigma_y^2 = \frac{\sigma^2}{N}\,(1 + \tfrac23 a + \tfrac15 a^2)\cdot 9,$$

$\frac{\sigma^2}{N}\cdot 9$ being the maximum $\sigma_y^2$ obtained from a rectangular distribution of observations with the standard deviation $\sigma$.

The last column of the same table gives the maximum $\sigma_y$ arrived at when $\phi(x)$ is the rectangular distribution with clusters at $-1$ and 1 for which $[\sigma_y]_{x=0} = [\sigma_y]_{x=1}$. For this distribution consisting of $.22026\,N$ observations at $+1$ and at $-1$ and $.5595\,N = 2cN$ uniformly distributed from $-1$ to 1, we have found as given in Table II (p. 50) the maximum $\frac{\sigma}{\sqrt N}\cdot 1.862$. Hence when $\mu_2$ and $\mu_4$ are the moment coefficients of this $\phi(x)$ the maximum is found from

$$\sigma_y^2 = \frac{\sigma^2}{N}\,(1 + 2a\mu_2 + a^2\mu_4)\cdot 1.862^2.$$

We find $\mu_2 = .6270$, $\mu_4 = .5524$ and $\frac1k$ is $1 + 1.2540\,a + .5524\,a^2$.

The actual distribution is hence

$$\psi(x) = \frac{.27975\,(1 + ax^2)^2}{1 + 1.2540\,a + .5524\,a^2}\,N,$$

together with the clusters

$$\frac{.22026\,(1 + a)^2}{1 + 1.2540\,a + .5524\,a^2}\,N$$

at $-1$ and 1.
TABLE IV.

(All three columns give the maximum of $\sigma_y\sqrt N/\sigma$.)

          Maximum for      Maximum for            Maximum for distribution
   a      the best         distribution with      with phi(x) = c and
          distribution     phi(x) = N/2           clusters at +-1

   0        1.732             3.000                  1.862
   1        3.000             4.099                  3.120
   2        4.359             5.310                  4.453
   3        5.745             6.573                  5.810
   4        7.135             7.861                  7.178
   5        8.522             9.165                  8.551


The difference between the first and second maxima taken as a proportion of the 
first varies from 73 per cent. at a = 0 to 8 per cent. at a = 5, while the difference
between the first and the third maxima varies from 8 per cent. at a = 0 to 0-4 
per cent. at a=5. The continuous distribution with clusters is therefore 
especially useful for smaller a. 


For a = 4 we find from (82) v = -9816 and for a = 5, v = -9700, both of these 
values of v are so close to 1 that if instead of using them we take the observations 
at 1 and — 1 and let the numbers of the three groups of observations be proportional 
to the squared standard deviations we get the maxima 7-141 and 8-544 which only 
differ quite insignificantly from the corresponding values of Table IV. 
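The first column of Table IV, and the two simplified maxima just quoted, follow from (81) and (82). The sketch below evaluates them; it assumes $0 \leq a < 3 + 2\sqrt2$, the range treated in the text, and the function names are ours.

```python
import math

def best_max_scaled_sd(a):
    """Largest sigma_y*sqrt(N)/sigma for the three-group design, for 0 <= a < 3 + 2*sqrt(2)."""
    # Formula (82), with the groups kept at the ends of the range when a <= 3.
    v2 = 1.0 if a <= 3.0 else (1.0 + math.sqrt(33.0 + 48.0 / a)) / 8.0
    # Formula (81): 1/(1-y) = 1 + (1+v^2)(1+a v^2)^2 / (2v^2 - 1)
    return math.sqrt(1.0 + (1.0 + v2) * (1.0 + a * v2) ** 2 / (2.0 * v2 - 1.0))

def ends_and_middle(a):
    """Same quantity when the groups are put at -1, 0, 1 in proportion to the squared s.d."""
    return math.sqrt(1.0 + 2.0 * (1.0 + a) ** 2)

for a in range(6):
    print(a, round(best_max_scaled_sd(a), 3), round(ends_and_middle(a), 3))
```

For $a \leq 3$ the two functions coincide, since the best groups are then themselves at $-1$, 0 and 1.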


(7) For a function of the first degree, of which the standard deviation of the observations is $\sigma(1 + ax)$, where $0 \leq a < 1$, we have, according to (66) and (67),

$$\sigma_y^2 = \frac{\sigma^2}{N}\cdot\frac{1 + 2a\mu_1 + a^2\mu_2}{\mu_2 - \mu_1^2}\,\{\mu_2 - 2\mu_1 x + x^2\} \qquad (85).$$

For $\mu_1 = -c^2$ the maximum of this function is at $x = 1$, and for $\mu_1 = c^2$ at $-1$.

As the maximum of $(\mu_2 - 2\mu_1 x + x^2)$ has the same value in both cases it is clear that the negative $\mu_1$ gives the lower maximum for $\sigma_y^2$. We therefore only have to find the conditions for $[\sigma_y^2]_{x=1}$ being a minimum when $\mu_1 \leq 0$.

We have

$$[\sigma_y^2]_{x=1} = \frac{\sigma^2}{N}\cdot\frac{(1 + 2a\mu_1 + a^2\mu_2)(\mu_2 - 2\mu_1 + 1)}{\mu_2 - \mu_1^2} \qquad (86),$$

and differentiating with regard to $\mu_2$,

$$\frac{d}{d\mu_2}[\sigma_y^2]_{x=1} = \frac{\sigma^2}{N}\cdot\frac{a^2\,(\mu_2 - \mu_1^2)^2 - (1 - \mu_1)^2(1 + a\mu_1)^2}{(\mu_2 - \mu_1^2)^2}.$$

As $a < 1$, we have $(1 - \mu_1)(1 + a\mu_1) > 0$ and

$$a\,(\mu_2 - \mu_1^2) - (1 - \mu_1)(1 + a\mu_1) = (a\mu_2 - 1) + \mu_1\,(1 - a) < 0,$$

from which it follows that

$$\frac{d}{d\mu_2}[\sigma_y^2]_{x=1} < 0$$

for any $\mu_1 \leq 0$.

The greatest value $\mu_2$ can take for our range $-1$ to $+1$ is 1, the minimum of $[\sigma_y^2]_{x=1}$ must therefore be found for $\mu_2 = 1$, for which value (86) passes into

$$[\sigma_y^2]_{x=1} = \frac{\sigma^2}{N}\cdot\frac{2\,(1 + 2a\mu_1 + a^2)}{1 + \mu_1},$$

which, since $\mu_1 \leq 0$, is a minimum and equals $\frac{\sigma^2}{N}\cdot 2\,(1 + a^2)$ when $\mu_1 = 0$.

The $\phi(x)$ distribution ought accordingly to consist of two equally big groups at the ends of the range and the actual distribution to be chosen for a function of the first degree, the standard deviation of which is a linear function of the variable, should be two groups at the ends of the working range with numbers proportional to the squared standard deviations at these places.


(8) For a continuous distribution from $-1$ to 1 with frequencies proportional to the squared standard deviations we have

$$\mu_1 = 0 \quad\text{and}\quad \mu_2 = \tfrac13,$$

and the maximum

$$[\sigma_y^2]_{x=1} = \frac{\sigma^2}{N}\left(1 + \frac{a^2}{3}\right)\cdot 4;$$

the actual distribution is

$$\psi(x) = \frac N2\cdot\frac{(1 + ax)^2}{1 + \tfrac13 a^2}.$$

Table V contains besides the maxima of $\sigma_y$ from these two distributions those obtained from a distribution for which $\phi(x)$ is constant with two additional clusters at $-1$ and 1 each consisting of $\tfrac14$ of the observations.

The actual distribution is, since

$$\mu_2 = \tfrac16 + \tfrac12 = \tfrac23,$$

$$\psi(x) = \frac{(1 + ax)^2}{1 + \tfrac23 a^2}\cdot\frac N4,$$

with $\frac{(1 - a)^2}{1 + \tfrac23 a^2}\cdot\frac N4$ observations at $-1$ and $\frac{(1 + a)^2}{1 + \tfrac23 a^2}\cdot\frac N4$ at $+1$ in addition.
The maximum of $\sigma_y^2$ is

$$\frac{\sigma^2}{N}\,(1 + \tfrac23 a^2)\cdot\tfrac52.$$
TABLE V.

(All three columns give the maximum of $\sigma_y\sqrt N/\sigma$.)

          Maximum for      Maximum for            Maximum for distribution
   a      best distri-     distribution with      with phi(x) = N/4 and
          bution           phi(x) = N/2           clusters at +-1

   .0       1.414             2.000                  1.581
   .1       1.421             2.003                  1.587
   .2       1.442             2.013                  1.602
   .3       1.477             2.030                  1.628
   .4       1.523             2.053                  1.663
   .5       1.581             2.082                  1.708
   .6       1.649             2.117                  1.761
   .7       1.726             2.157                  1.821
   .8       1.811             2.203                  1.889
   .9       1.903             2.254                  1.962
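Each column of Table V is the square root of a simple expression in $a$: $2(1 + a^2)$ for the two end groups, $4(1 + \tfrac13 a^2)$ for the continuous distribution and $\tfrac52(1 + \tfrac23 a^2)$ for the continuous distribution with clusters. A short sketch to regenerate the rows follows; the function name is ours.

```python
import math

def table_v_row(a):
    """Maximum of sigma_y*sqrt(N)/sigma for the three distributions compared in Table V."""
    best = math.sqrt(2.0 * (1.0 + a * a))                 # two end groups, numbers as (1 -/+ a)^2
    continuous = math.sqrt(4.0 * (1.0 + a * a / 3.0))     # phi(x) = N/2
    mixed = math.sqrt(2.5 * (1.0 + 2.0 * a * a / 3.0))    # phi(x) = N/4 plus N/4 at each end
    return best, continuous, mixed

for tenth in range(10):
    a = tenth / 10.0
    print(f"{a:.1f}   " + "   ".join(f"{value:.3f}" for value in table_v_row(a)))
```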


(9) For a function of the second degree we found in (5) that when the standard deviation of the observations was $s_x = \sigma(1 + ax^2)$ and $a \leq 3$ it was advantageous to use the whole working range of observations, much more must this be the case when $s_x = \sigma(1 + ax)$ and $0 \leq a < 1$. We shall therefore try to find the three best groups of observations taken at $-1$, $v$, and 1, supposing $v$ unknown. We do not venture to assert that another form of distribution might not lead to a curve of standard deviation with lower maximum, but the solution of the general problem would involve a more elaborate investigation into the possible variations of $\mu_1$, $\mu_2$, $\mu_3$ and $\mu_4$ for distributions with limited range than seems desirable in this connection. We shall further limit our problem by assuming that the best distribution will be found among those which make $[\sigma_y^2]_{x=1} = [\sigma_y^2]_{x=-1}$ and both also equal to a maximum situated between $x = -1$ and $x = 1$. This would obviously be right if the maximum were found at $x = v$; this in fact is not the case, but still the maximum value is likely to be chiefly determined by the number of observations at $x = v$ and there is therefore every reason to believe that our assumption is justifiable.

Let there be $N\delta$ observations at $-1$, $N\gamma$ at 1 and $N(1 - \delta - \gamma)$ at $v$. The interpolation formula of Lagrange then gives

$$y = \frac{(x - v)(x - 1)}{2\,(1 + v)}\,\bar y_{-1} + \frac{(x - v)(x + 1)}{2\,(1 - v)}\,\bar y_{1} + \frac{x^2 - 1}{v^2 - 1}\,\bar y_{v},$$

from which we find

$$\sigma_y^2 = \frac{\sigma^2}{N}\left\{\frac{(x - v)^2(x - 1)^2(1 - a)^2}{4\,(1 + v)^2\,\delta} + \frac{(x - v)^2(x + 1)^2(1 + a)^2}{4\,(1 - v)^2\,\gamma} + \frac{(x^2 - 1)^2(1 + av)^2}{(v^2 - 1)^2\,(1 - \delta - \gamma)}\right\}.$$

The condition for $[\sigma_y^2]_{x=1} = [\sigma_y^2]_{x=-1}$ is

$$\frac{(1 + a)^2}{\gamma} = \frac{(1 - a)^2}{\delta}.$$


we=1 
Khminating 6 we obtain for o;} — o; the value 
. opty cage CSC reel) (a URE tas) (ne It) 
oN oe (1 + a)? — 2y (1 + o?) 
5, [ll + v?) a + Qu (1 — vo?) a + 2 — 54 v| 


or 


» 7 o& (1 +.a)2 (22 — 1) a ie 
N° PI + a= yl s 
+ Qu (1 — 0) [(1 + a)? — 2y (1 + a2)] a + (1 + a)? (2 — Be? + v1) 
— Qy [(d + 8) (2 — Set of) 4p ow)8)} ceca eee ( 


Our assumption that the maximum $\sigma_y^2$ shall be equal to $\sigma_y^2\big|_{x=1}$ requires that the
expression in curled brackets shall be a perfect square for which the condition is

$$2\left(\frac{\gamma}{(1+a)^2}\right)^2\{a^2(1+a^2)v^6 + 2a(1+a^2)v^5 + (3 - a^2 - 3a^4)v^4 - 4a(3+2a^2)v^3$$
$$+\ (-2 + 9a^2 + 5a^4)v^2 + 2a(3+a^2)v - a^2(3+2a^2)\}$$
$$+\ \frac{\gamma}{(1+a)^2}\{-a^2v^6 - 2av^5 + (-5+2a^2)v^4 + 12av^3 - (2+9a^2)v^2 - 2av + 3 + 4a^2\} + v^4 + 2v^2 - 1 = 0 \quad\ldots(88).$$

Now $\sigma_y^2\big|_{x=1} = \frac{\sigma^2}{N}\cdot\frac{(1+a)^2}{\gamma}$ is the maximum which we want to make as low as
possible, hence we have for a certain $a$ to find the $v$ for which $\frac{\gamma}{(1+a)^2}$ as given by
(88) is a maximum.

We shall examine the cases $a = .5$ and $a = .9$.
(10) For $a = .5$ (88) takes the form

$$\left(\frac{\gamma}{(1+a)^2}\right)^2\{.625v^6 + 2.5v^5 + 5.125v^4 - 14v^3 + 1.125v^2 + 6.5v - 1.75\}$$
$$+\ \frac{\gamma}{(1+a)^2}\{-.25v^6 - v^5 - 4.5v^4 + 6v^3 - 4.25v^2 - v + 4\} + v^4 + 2v^2 - 1 = 0,$$

which differentiated with regard to $v$ gives

$$\left(\frac{\gamma}{(1+a)^2}\right)^2\{3.75v^5 + 12.5v^4 + 20.5v^3 - 42v^2 + 2.25v + 6.5\}$$
$$+\ \frac{\gamma}{(1+a)^2}\{-1.5v^5 - 5v^4 - 18v^3 + 18v^2 - 8.5v - 1\} + 4v(v^2+1) = 0.$$

We find that these two equations have for $v = -.190$ the root $\frac{\gamma}{(1+a)^2} = .2936$
in common which represents a maximum.

The maximum of the curve is hence $\frac{\sigma^2}{N}\cdot 3.405$, which value occurs for $x = \pm 1$
and for $x = .064$ determined by (87).
The distribution of observations is
.6607 N at 1,
.0734 N at −1,
and .2659 N at −.190.


For comparison we shall consider what would result from taking for the $\phi(x)$
distribution three equally big groups of observations at $-1$, $0$ and $1$. This would
for observations with the constant error $\sigma$ make the maximum of the curve equal
to $\frac{\sigma^2}{N}\cdot 3$ and that multiplied by

$$1 + 2a\mu_1 + a^2\mu_2 = \frac{3.5}{3}$$

gives $\frac{\sigma^2}{N}\cdot 3.5$.

The actual distribution $\psi(x)$ would be
.6429 N at 1,
.0714 N at −1,
and .2857 N at 0.
This last distribution only makes the maximum $\sigma_y^2$ about 3 per cent. greater
than the value which we obtained by our special distribution and it will therefore
for most practical cases be as useful.
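The comparison just made can be checked numerically. The following is a minimal sketch, not part of the original memoir: it assumes the Lagrange form of the interpolated value for three groups and the error law $s_x = \sigma(1+ax)$, and the function names are hypothetical. It evaluates $N\sigma_y^2/\sigma^2$ on a grid over the range and reports its largest value for the special distribution and for the simpler one quoted above.

```python
def lagrange_weights(x, nodes):
    """Lagrange basis polynomials of the three nodes, evaluated at x."""
    weights = []
    for i, xi in enumerate(nodes):
        w = 1.0
        for j, xj in enumerate(nodes):
            if j != i:
                w *= (x - xj) / (xi - xj)
        weights.append(w)
    return weights


def scaled_variance(x, a, nodes, fractions):
    """N * sigma_y^2 / sigma^2 at x when s_x = sigma * (1 + a*x)."""
    total = 0.0
    for w, xi, frac in zip(lagrange_weights(x, nodes), nodes, fractions):
        # variance of a group mean is sigma^2 (1 + a*xi)^2 / (N * frac)
        total += w * w * (1.0 + a * xi) ** 2 / frac
    return total


def max_scaled_variance(a, nodes, fractions, steps=4000):
    """Largest value of the scaled variance on a grid over [-1, 1]."""
    grid = [-1.0 + 2.0 * k / steps for k in range(steps + 1)]
    return max(scaled_variance(x, a, nodes, fractions) for x in grid)


# Fractions of N as quoted in the text for a = .5
special = max_scaled_variance(0.5, (-1.0, -0.190, 1.0), (0.0734, 0.2659, 0.6607))
simple = max_scaled_variance(0.5, (-1.0, 0.0, 1.0), (0.0714, 0.2857, 0.6429))
print(round(special, 3), round(simple, 3))
```

Because the quoted fractions are rounded to four places, the two maxima agree with 3.405 and 3.5 only to about the third decimal.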


(11) When $a = .9$ we find for (88),

$$\left(\frac{\gamma}{(1+a)^2}\right)^2\{2.9322v^6 + 6.516v^5 + .4434v^4 - 33.264v^3 + 17.141v^2 + 13.716v - 7.4844\}$$
$$+\ \frac{\gamma}{(1+a)^2}\{-.81v^6 - 1.8v^5 - 3.38v^4 + 10.8v^3 - 9.29v^2 - 1.8v + 6.24\} + v^4 + 2v^2 - 1 = 0,$$

which differentiated with regard to $v$ gives

$$\left(\frac{\gamma}{(1+a)^2}\right)^2\{17.5932v^5 + 32.58v^4 + 1.7736v^3 - 99.792v^2 + 34.282v + 13.716\}$$
$$+\ \frac{\gamma}{(1+a)^2}\{-4.86v^5 - 9v^4 - 13.52v^3 + 32.4v^2 - 18.58v - 1.8\} + 4v(v^2+1) = 0.$$

For $v = -.354$ these two equations have the root $\frac{\gamma}{(1+a)^2} = .23214$ in common
which is therefore the maximum of $\frac{\gamma}{(1+a)^2}$.

The maximum of the corresponding $\sigma_y^2$ is hence

$$\frac{\sigma^2}{N}\cdot\frac{(1+a)^2}{\gamma} = \frac{\sigma^2}{N}\cdot 4.308.$$

From (87) we find that it occurs at $x = .125$ as well as at $x = \pm 1$. The distribution of observations is then
.8380 N at 1,
.0023 N at −1,
and .1597 N at −.354.

Comparing again with a distribution consisting of three groups of observations
at $-1$, $0$ and $1$ with frequencies proportional to the squared standard deviations at
these places we find that the distribution would be

.7814 N at 1,
.0022 N at −1,
and .2164 N at 0,

and the maximum of $\sigma_y^2$ would be $\frac{\sigma^2}{N}\cdot 4.62$.

We thus find that by our special distribution the maximum of $\sigma_y^2$ was 7 per cent.
lower, the choice of that distribution would thus permit us to reduce the total
number of observations at the same rate without raising the maximum of $\sigma_y^2$.


(12) The result of these investigations is that the maximum $\sigma_y$ obtained from
the best three groups of observations differs so little from that obtained from three groups
at $-1$, $0$ and $1$ that the first grouping only in quite exceptional practice would be preferred.

We shall therefore in Table VI give the maximum $\sigma_y$ arrived at from the
following three distributions: (1) three groups of observations at $-1$, $0$ and $1$ in
numbers proportional to the squared standard deviations at these places, (2) a
distribution for which $\phi(x) = \frac{N}{2}$ and (3) a distribution for which $\phi(x) = .2797N$
with additional clusters $.2203N$ at $\pm 1$ (see Table II, p. 50).

Both in Table V and in Table VI the difference between the two first maxima
as a proportion of the first decreases with increasing $a$ so that the distribution with
uniform $\phi(x)$ is more profitable for $a > 0$ than for observations with constant
errors.


VIII. Best distribution of observations for determining a single constant 
of the function. 


(1) Our choice of observations has hitherto aimed at giving within the working 
range of observations a determination of the function as accurate and uniform as 
possible. We shall now consider what is the best choice of observations for 


Kirstine Smith 73
TABLE VI.

Maximum of $\sigma_y\sqrt{N}/\sigma$.

        From three groups    From distribution      From distribution for which       From best
  a     at 0 and ±1          for which φ(x) = N/2   φ(x) = .2797N and clusters at ±1  three groups

  .0    1.732                3.000                  1.862                             -
  .1    1.738                3.005                  1.868                             -
  .2    1.755                3.020                  1.886                             -
  .3    1.783                3.045                  1.914                             -
  .4    1.822                3.079                  1.954                             -
  .5    1.871                3.122                  2.003                             1.845
  .6    1.929                3.175                  2.062                             -
  .7    1.995                3.236                  2.129                             -
  .8    2.069                3.304                  2.205                             -
  .9    2.149                3.381                  2.287                             2.076


determining a single constant of the function. 


The investigations will be carried out for functions of the first and of the second degree for which the standard
deviations of the observations are

$$s_x = \sigma(1 + ax^2), \qquad a > -1,$$

or

$$s_x = \sigma(1 + ax), \qquad 0 \le a < 1.$$


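We have in (3) of Section I given the formula (8) for $\sigma^2_{a_p}$, and shall here give only
the form to which it is transferred by putting

$$\psi(x) = k\,\phi(x)\,f(x), \qquad \frac{1}{k} = \frac{1}{N}\int \phi(x)\,f(x)\,dx.$$

The formula analogous to that given for $\sigma_y^2$ (66) is

$$\begin{vmatrix}
\dfrac{kN}{\sigma^2}\,\sigma^2_{a_p} & 0 & 0 & \cdots & 1 & \cdots & 0 \\
0 & 1 & \mu_1 & \cdots & \mu_p & \cdots & \mu_n \\
0 & \mu_1 & \mu_2 & \cdots & \mu_{p+1} & \cdots & \mu_{n+1} \\
\vdots & \vdots & \vdots & & \vdots & & \vdots \\
1 & \mu_p & \mu_{p+1} & \cdots & \mu_{2p} & \cdots & \mu_{n+p} \\
\vdots & \vdots & \vdots & & \vdots & & \vdots \\
0 & \mu_n & \mu_{n+1} & \cdots & \mu_{p+n} & \cdots & \mu_{2n}
\end{vmatrix} = 0 \quad\ldots(89).$$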


(2) For a function of the first degree

$$y = a_0 + a_1 x$$

for which the standard deviation of an observation is

$$s_x = \sigma(1 + ax^2), \qquad a > -1,$$

and therefore

$$\frac{1}{k} = 1 + 2a\mu_2 + a^2\mu_4,$$

we find, according to (89),

$$\sigma^2_{a_0} = \frac{\sigma^2}{N}(1 + 2a\mu_2 + a^2\mu_4)\left(1 + \frac{\mu_1^2}{\mu_2 - \mu_1^2}\right) \quad\ldots(90),$$

and

$$\sigma^2_{a_1} = \frac{\sigma^2}{N}(1 + 2a\mu_2 + a^2\mu_4)\,\frac{1}{\mu_2 - \mu_1^2} \quad\ldots(91).$$


As for any skew distribution of observations we can find a corresponding
symmetrical distribution with the same $\mu_2$ and $\mu_4$, both these expressions are a
minimum for $\mu_1 = 0$.

We have already shown in (2) of Section VII that any possible values of $\mu_2$ and
$\mu_4$ can be produced by three symmetrical groups of observations, so that by introducing the variables $v$ and $\gamma$ determined by

$$\mu_2 = v^2\gamma, \qquad \mu_4 = v^4\gamma,$$

and limited by

$$0 \le v^2 \le 1, \qquad 0 \le \gamma \le 1,$$

we do not leave out any possibilities.


From (90) we then get

$$\sigma^2_{a_0} = \frac{\sigma^2}{N}(1 + 2a\gamma v^2 + a^2\gamma v^4),$$

which for $a > 0$ is a minimum when $\gamma = v^2 = 0$, and for $a = 0$ is $\frac{\sigma^2}{N}$ for any $\gamma$
and $v^2$.

For $a < 0$ we find, since

$$\frac{d\sigma^2_{a_0}}{d(v^2)} = \frac{\sigma^2}{N}\,2a\gamma(1 + av^2) \quad\text{and}\quad v^2 < -\frac{1}{a},$$

that for a constant $\gamma$, $\sigma^2_{a_0}$ has the least value when $v^2$ is as great as possible, that
is for $v^2 = 1$.

The minimum of $\sigma^2_{a_0}$ is then

$$\sigma^2_{a_0} = \frac{\sigma^2}{N}\{1 + (2+a)\,a\gamma\},$$

which, since $a(2+a) < 0$, is a minimum when $\gamma$ takes its greatest possible value 1.

The minimum is thus

$$\sigma^2_{a_0} = \frac{\sigma^2}{N}(1+a)^2.$$
Hence we conclude that:

when $a > 0$, $\sigma^2_{a_0}$ is a minimum and equal to $\frac{\sigma^2}{N}$ for $N$ observations at $x = 0$,

when $a = 0$, $\sigma^2_{a_0}$ is a minimum and equal to $\frac{\sigma^2}{N}$ for any distribution for which $\mu_1 = 0$,
and

when $a < 0$, $\sigma^2_{a_0}$ is a minimum and equal to $\frac{\sigma^2}{N}(1+a)^2$ for two equally big groups
of observations at $\pm 1$.

(3) When we introduce $\mu_1 = 0$, $\mu_2 = \gamma v^2$ and $\mu_4 = \gamma v^4$ in (91) we get

$$\sigma^2_{a_1} = \frac{\sigma^2}{N}(1 + 2a\gamma v^2 + a^2\gamma v^4)\,\frac{1}{\gamma v^2}.$$

This for constant $v^2$ is a minimum when $\gamma = 1$ and then equal to

$$\sigma^2_{a_1} = \frac{\sigma^2}{N}\,\frac{1 + 2av^2 + a^2v^4}{v^2} \quad\ldots(92),$$

$$\frac{d\sigma^2_{a_1}}{d(v^2)} = \frac{\sigma^2}{N}\left(a^2 - \frac{1}{v^4}\right).$$

$v^2 = \frac{1}{a}$ when possible, that is for $a \ge 1$, determines a minimum, while for
$a < 1$, $\sigma^2_{a_1}$ reaches its lowest value for $v^2 = 1$. From (92) we find for $a \ge 1$ the
minimum

$$\sigma^2_{a_1} = \frac{\sigma^2}{N}\cdot 4a,$$

and for $a \le 1$ the minimum

$$\sigma^2_{a_1} = \frac{\sigma^2}{N}(1+a)^2,$$

both formulae giving $\sigma^2_{a_1} = \frac{\sigma^2}{N}\cdot 4$ for $a = 1$.

Our results are accordingly:

when $a > 1$, $\sigma^2_{a_1}$ is a minimum and equal to $\frac{\sigma^2}{N}\cdot 4a$ for two equally big groups of
observations at $x = \pm\frac{1}{\sqrt{a}}$, or for any distribution with the same $\mu_2$ and $\mu_4$,

and when $a \le 1$, $\sigma^2_{a_1}$ is a minimum and equal to $\frac{\sigma^2}{N}(1+a)^2$ for two equally big
groups of observations at $x = \pm 1$.
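The conclusions of (2) and (3) for the line with $s_x = \sigma(1+ax^2)$ can be confirmed by a brute-force search. The sketch below is illustrative only: it assumes the symmetrical three-group parametrisation $\mu_1 = 0$, $\mu_2 = \gamma v^2$, $\mu_4 = \gamma v^4$ used above, and the helper names are hypothetical.

```python
def var_a0(a, gamma, v2):
    """N*sigma_a0^2/sigma^2 for mu1 = 0, mu2 = gamma*v2, mu4 = gamma*v2**2."""
    return 1.0 + 2.0 * a * gamma * v2 + a * a * gamma * v2 * v2


def var_a1(a, gamma, v2):
    """N*sigma_a1^2/sigma^2 for the same symmetrical distribution."""
    return var_a0(a, gamma, v2) / (gamma * v2)


def grid_minimum(func, a, steps=200):
    """Smallest value of func over a grid of 0 < gamma <= 1, 0 < v2 <= 1.

    Returns (value, gamma, v2) at the best grid point."""
    best = None
    for i in range(1, steps + 1):
        for j in range(1, steps + 1):
            gamma, v2 = i / steps, j / steps
            value = func(a, gamma, v2)
            if best is None or value < best[0]:
                best = (value, gamma, v2)
    return best


print(grid_minimum(var_a0, -0.5))  # expected near (1+a)^2 with gamma = v2 = 1
print(grid_minimum(var_a1, 2.0))   # expected near 4a with gamma = 1, v2 = 1/a
print(grid_minimum(var_a1, 0.5))   # expected near (1+a)^2 with gamma = v2 = 1
```

The grid deliberately leaves out $\gamma v^2 = 0$, where $\sigma_{a_1}$ is undefined, so the case $a > 0$ for $\sigma_{a_0}$ (all observations at $x = 0$) is not reproduced by this search.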


We see that for a = 0 two equally big groups of observations at + 1 make both 
o,, and a7, minima and these groups in addition form the distribution for which o? 
has the lowest maximum within the possible range of observations. 


(4) For a function of the second degree

$$y = a_0 + a_1 x + a_2 x^2,$$

with the standard deviations of observations

$$s_x = \sigma(1 + ax^2), \qquad a > -1,$$

and therefore

$$\frac{1}{k} = 1 + 2a\mu_2 + a^2\mu_4,$$

we find from (89)

$$\sigma^2_{a_0} = \frac{\sigma^2}{N}(1 + 2a\mu_2 + a^2\mu_4)\,\frac{\mu_2\mu_4 - \mu_3^2}{\mu_2\mu_4 - \mu_3^2 - \mu_2^3 + 2\mu_1\mu_2\mu_3 - \mu_1^2\mu_4} \quad\ldots(93),$$

$$\sigma^2_{a_1} = \frac{\sigma^2}{N}(1 + 2a\mu_2 + a^2\mu_4)\,\frac{\mu_4 - \mu_2^2}{\mu_2\mu_4 - \mu_3^2 - \mu_2^3 + 2\mu_1\mu_2\mu_3 - \mu_1^2\mu_4} \quad\ldots(94),$$

and

$$\sigma^2_{a_2} = \frac{\sigma^2}{N}(1 + 2a\mu_2 + a^2\mu_4)\,\frac{\mu_2 - \mu_1^2}{\mu_2\mu_4 - \mu_3^2 - \mu_2^3 + 2\mu_1\mu_2\mu_3 - \mu_1^2\mu_4} \quad\ldots(95).$$

We shall prove that the last factor of each of these formulae is a minimum for
$\mu_1 = \mu_3 = 0$.
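To prove this for (93) we consider the difference

$$\frac{\mu_2\mu_4 - \mu_3^2}{\mu_2\mu_4 - \mu_3^2 - \mu_2^3 + 2\mu_1\mu_2\mu_3 - \mu_1^2\mu_4} - \frac{\mu_4}{\mu_4 - \mu_2^2} = \frac{(\mu_2\mu_3 - \mu_1\mu_4)^2}{(\mu_4 - \mu_2^2)\,(\mu_2\mu_4 - \mu_3^2 - \mu_2^3 + 2\mu_1\mu_2\mu_3 - \mu_1^2\mu_4)} \ge 0,$$

from which follows

$$\frac{\mu_2\mu_4 - \mu_3^2}{\mu_2\mu_4 - \mu_3^2 - \mu_2^3 + 2\mu_1\mu_2\mu_3 - \mu_1^2\mu_4} \ge \frac{\mu_2\mu_4}{\mu_2\mu_4 - \mu_2^3}.$$

For (94) it is at once clear that

$$\frac{\mu_4 - \mu_2^2}{\mu_2\mu_4 - \mu_2^3 - \{(\mu_3 - \mu_1\mu_2)^2 + \mu_1^2(\mu_4 - \mu_2^2)\}} \ge \frac{\mu_4 - \mu_2^2}{\mu_2\mu_4 - \mu_2^3}.$$

For the case of (95) we compare

$$\frac{\mu_2 - \mu_1^2}{\mu_2\mu_4 - \mu_3^2 - \mu_2^3 + 2\mu_1\mu_2\mu_3 - \mu_1^2\mu_4} \quad\text{and}\quad \frac{\mu_2}{\mu_2\mu_4 - \mu_2^3},$$

and we find the difference

$$\frac{1}{\mu_4 - \mu_2^2 - \dfrac{(\mu_3 - \mu_1\mu_2)^2}{\mu_2 - \mu_1^2}} - \frac{1}{\mu_4 - \mu_2^2} \ge 0,$$

and hence

$$\frac{\mu_2 - \mu_1^2}{\mu_2\mu_4 - \mu_3^2 - \mu_2^3 + 2\mu_1\mu_2\mu_3 - \mu_1^2\mu_4} \ge \frac{\mu_2}{\mu_2\mu_4 - \mu_2^3}.$$

It is thus proved for the three formulae that a distribution of observations for
which $\mu_1 = \mu_3 = 0$ gives lower values than any distribution with the same $\mu_2$ and
$\mu_4$ as the former and with $\mu_1 \gtrless 0$, $\mu_3 \gtrless 0$.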


Hence our problem is reduced to finding the $\mu_2$ and $\mu_4$ which make the following
expressions minima:

$$\sigma^2_{a_0} = \frac{\sigma^2}{N}(1 + 2a\mu_2 + a^2\mu_4)\,\frac{\mu_4}{\mu_4 - \mu_2^2} \quad\ldots(96),$$

$$\sigma^2_{a_1} = \frac{\sigma^2}{N}(1 + 2a\mu_2 + a^2\mu_4)\,\frac{1}{\mu_2} \quad\ldots(97),$$

$$\sigma^2_{a_2} = \frac{\sigma^2}{N}(1 + 2a\mu_2 + a^2\mu_4)\,\frac{1}{\mu_4 - \mu_2^2} \quad\ldots(98).$$
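(5) Introducing $\mu_2 = \gamma v^2$ and $\mu_4 = \gamma v^4$ in (96) we get

$$\sigma^2_{a_0} = \frac{\sigma^2}{N}\,\frac{1 + 2a\gamma v^2 + a^2\gamma v^4}{1 - \gamma},$$

which is seen to be $> \frac{\sigma^2}{N}$ except when $\gamma = 0$.

Hence the minimum value of $\sigma^2_{a_0} = \frac{\sigma^2}{N}$ can only be obtained by taking all the
observations at $x = 0$.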


(97) is identical with (91) for $\mu_1 = 0$. The conditions for a minimum of $\sigma^2_{a_1}$ are
therefore the same for a function of the second degree as for a function of the
first degree. That is, when $a > 1$, $\sigma^2_{a_1}$ is a minimum and equal to $\frac{\sigma^2}{N}\cdot 4a$ for two
equally big groups of observations at $x = \pm\frac{1}{\sqrt{a}}$, or for any distribution with the same
$\mu_2$ and $\mu_4$, and when $a \le 1$, $\sigma^2_{a_1}$ is a minimum and equal to $\frac{\sigma^2}{N}(1+a)^2$ for two equally
big groups of observations at $x = \pm 1$.


With the variates $\gamma$ and $v$ (98) takes the form

$$\sigma^2_{a_2} = \frac{\sigma^2}{N}\,\frac{1 + 2a\gamma v^2 + a^2\gamma v^4}{\gamma(1-\gamma)v^4}.$$

By differentiating with regard to $v^2$ we get

$$\frac{d\sigma^2_{a_2}}{d(v^2)} = -\frac{\sigma^2}{N}\,\frac{2(1 + a\gamma v^2)}{\gamma(1-\gamma)v^6},$$

which is negative for any $a$, $v$ and $\gamma$ within our limits.

For constant $\gamma$, $\sigma^2_{a_2}$ is therefore least when $v^2 = 1$ and the minimum value is

$$\sigma^2_{a_2} = \frac{\sigma^2}{N}\left\{\frac{1}{\gamma} + \frac{(1+a)^2}{1-\gamma}\right\} \quad\ldots(99).$$

This is again a minimum when

$$\frac{d\sigma^2_{a_2}}{d\gamma} = \frac{\sigma^2}{N}\left\{-\frac{1}{\gamma^2} + \frac{(1+a)^2}{(1-\gamma)^2}\right\} = 0,$$

that is for $\gamma = \frac{1}{2+a}$, which gives a minimum both for positive and negative $a$.

Thus the distribution that makes $\sigma^2_{a_2}$ a minimum has a $\phi(x)$-distribution
consisting of $\frac{N}{2(2+a)}$ observations at $-1$ and $1$ and $\frac{1+a}{2+a}N$ observations at $0$.

We have

$$\mu_2 = \mu_4 = \frac{1}{2+a} \quad\text{and}\quad k = \frac{1}{1+a}.$$

The relation $\psi(x) = k\,\phi(x)\,f(x)$ then gives us

$$\psi(0) = \frac{N}{2+a} \quad\text{and}\quad \psi(\pm 1) = \frac{1+a}{2(2+a)}\,N.$$

From (99) we find the minimum value

$$\sigma^2_{a_2} = \frac{\sigma^2}{N}(2+a)^2.$$
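The minimisation of (99) and the passage from the $\phi(x)$-distribution to the actual distribution can be traced numerically. This is a minimal sketch under the assumptions of this paragraph (groups at $-1$, $0$, $1$ and $f(x) = (1+ax^2)^2$); the function names are illustrative and not taken from the memoir.

```python
def var_a2(a, gamma):
    """N*sigma_a2^2/sigma^2 from (99): groups at -1, 0, 1 with v^2 = 1."""
    return 1.0 / gamma + (1.0 + a) ** 2 / (1.0 - gamma)


def optimal_design(a, n_total):
    """phi- and psi-distributions that minimise sigma_a2 (illustrative helper)."""
    gamma = 1.0 / (2.0 + a)
    # phi: gamma/2 of the observations at each of -1 and +1, the rest at 0
    phi = {-1: n_total * gamma / 2, 0: n_total * (1 - gamma), 1: n_total * gamma / 2}
    k = 1.0 / (1.0 + a)
    # psi(x) = k * phi(x) * f(x) with f(x) = (1 + a x^2)^2
    psi = {x: k * count * (1.0 + a * x * x) ** 2 for x, count in phi.items()}
    return gamma, phi, psi


gamma, phi, psi = optimal_design(0.5, 1000)
print(gamma, var_a2(0.5, gamma), psi)
```

For $a = .5$ the minimising $\gamma$ is $.4$, the value of (99) is $(2+a)^2 = 6.25$, and the actual numbers of observations add up to $N$ again.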
Our result is thus that $\sigma^2_{a_2}$ is a minimum and equal to $\frac{\sigma^2}{N}(2+a)^2$ for a distribution
consisting of $\frac{N}{2+a}$ observations at $0$ and $\frac{1+a}{2(2+a)}N$ at $\pm 1$.

(6) When the standard deviation of an observation is

$$s_x = \sigma(1 + ax) \quad\text{and}\quad 0 \le a < 1,$$

we have $\frac{1}{k} = 1 + 2a\mu_1 + a^2\mu_2$,
and according to (89) we find for a function of the first degree

$$\sigma^2_{a_0} = \frac{\sigma^2}{N}(1 + 2a\mu_1 + a^2\mu_2)\,\frac{\mu_2}{\mu_2 - \mu_1^2} \quad\ldots(100),$$

$$\sigma^2_{a_1} = \frac{\sigma^2}{N}(1 + 2a\mu_1 + a^2\mu_2)\,\frac{1}{\mu_2 - \mu_1^2} \quad\ldots(101).$$

By differentiating (100) we find

$$\frac{d\sigma^2_{a_0}}{d\mu_1} = \frac{\sigma^2}{N}\,\frac{2\mu_2(1 + a\mu_1)(\mu_1 + a\mu_2)}{(\mu_2 - \mu_1^2)^2}$$

and

$$\frac{d\sigma^2_{a_0}}{d\mu_2} = \frac{\sigma^2}{N}\,\frac{(a\mu_2 - 2a\mu_1^2 - \mu_1)(\mu_1 + a\mu_2)}{(\mu_2 - \mu_1^2)^2}.$$

Both of these can only be zero when

$$\mu_1 + a\mu_2 = 0 \quad\ldots(102),$$

which is seen to determine a minimum of $\sigma^2_{a_0}$ the value of which is $\frac{\sigma^2}{N}$. The
condition $\mu_2 = -\frac{\mu_1}{a}$ can be fulfilled by an infinity of different distributions. From
$0 \le \mu_2 \le 1$ follows the condition $0 \ge \mu_1 \ge -a$.


We shall confine our attention to those distributions which consist of two groups
of observations. Let there be $N\gamma$ observations at $v_1$ and $(1-\gamma)N$ at $v_2$, we then
have

$$\mu_1 = v_2 + \gamma(v_1 - v_2),$$
$$\mu_2 = v_2^2 + \gamma(v_1^2 - v_2^2),$$

from which by means of (102) is found

$$\frac{\gamma}{-v_2(1+av_2)} = \frac{1-\gamma}{v_1(1+av_1)} = \frac{1}{(v_1 - v_2)\{1 + a(v_1+v_2)\}}$$

and

$$\frac{1}{k} = 1 + a\mu_1 = \frac{(1+av_1)(1+av_2)}{1 + a(v_1+v_2)}.$$

Thus we find that the $\phi(x)$-distribution consists of

$$\frac{-v_2(1+av_2)}{(v_1 - v_2)\{1 + a(v_1+v_2)\}}\,N \text{ at } v_1$$

and

$$\frac{v_1(1+av_1)}{(v_1 - v_2)\{1 + a(v_1+v_2)\}}\,N \text{ at } v_2,$$

while the actual distribution

$$\psi(x) = k\,\phi(x)\,f(x) = \frac{1 + a(v_1+v_2)}{(1+av_1)(1+av_2)}\,\phi(x)\,(1+ax)^2$$

consists of

$$\frac{-v_2(1+av_1)}{v_1 - v_2}\,N \text{ at } v_1 \quad\text{and}\quad \frac{v_1(1+av_2)}{v_1 - v_2}\,N \text{ at } v_2 \quad\ldots(103).$$


We thus see that for any two points $v_1$ and $v_2$ of which one is negative and the other
positive we can choose the numbers of observations so as to make $\sigma^2_{a_0} = \frac{\sigma^2}{N}$, as it of
course would be by taking a single group of observations at $x = 0$.
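That the numbers in (103) really give $\sigma^2_{a_0} = \sigma^2/N$ can be verified directly from the line through the two group means. The following is a small sketch, assuming independent observations with $s_x = \sigma(1+ax)$; the function names and the numerical values of $a$, $v_1$, $v_2$ and $N$ are chosen only for illustration.

```python
def two_group_counts(a, v1, v2, n_total):
    """Actual numbers of observations at v1 and v2 according to (103)."""
    n1 = -v2 * (1.0 + a * v1) / (v1 - v2) * n_total
    n2 = v1 * (1.0 + a * v2) / (v1 - v2) * n_total
    return n1, n2


def intercept_variance(a, v1, v2, n1, n2, sigma=1.0):
    """Variance of a0 for the line through the two group means."""
    var_mean1 = (sigma * (1.0 + a * v1)) ** 2 / n1
    var_mean2 = (sigma * (1.0 + a * v2)) ** 2 / n2
    # a0 = (v1 * ybar2 - v2 * ybar1) / (v1 - v2)
    return (v2 ** 2 * var_mean1 + v1 ** 2 * var_mean2) / (v1 - v2) ** 2


n1, n2 = two_group_counts(0.4, 0.8, -0.3, 1000)
print(n1, n2, intercept_variance(0.4, 0.8, -0.3, n1, n2))
```

With $v_1 = .8$, $v_2 = -.3$ and $a = .4$ the groups contain 360 and 640 of 1000 observations, and the variance of the intercept is $\sigma^2/1000$.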
(7) By differentiating (101) we get

$$\frac{d\sigma^2_{a_1}}{d\mu_1} = \frac{\sigma^2}{N}\,\frac{2(1 + a\mu_1)(\mu_1 + a\mu_2)}{(\mu_2 - \mu_1^2)^2} \quad\ldots(104)$$

and

$$\frac{d\sigma^2_{a_1}}{d\mu_2} = -\frac{\sigma^2}{N}\,\frac{(1 + a\mu_1)^2}{(\mu_2 - \mu_1^2)^2}.$$

As the latter is always negative $\sigma^2_{a_1}$ is for constant $\mu_1$ least when $\mu_2$ has its
greatest value, that is 1.

Introducing this in (104) we get as condition for a minimum,

$$\mu_1 + a = 0.$$

There is only one distribution for which $\mu_2 = 1$ and $\mu_1 = -a$, and it is that
consisting of two groups of observations at $-1$ and $1$ included in the distributions
examined in (6).

From (103) we find that the actual distribution consists of $\frac{1-a}{2}N$ observations
at $-1$ and $\frac{1+a}{2}N$ at $1$. The minimum of $\sigma^2_{a_1}$ is from (101) found to be $\frac{\sigma^2}{N}$.

The minimum $\frac{\sigma^2}{N}$ of $\sigma^2_{a_1}$ can thus only be obtained by taking two groups of observations at the limits of the range with numbers proportional to the standard deviation
of observations at these places. This distribution makes also $\sigma^2_{a_0}$ a minimum, but it
is not, except when $a = 0$, the distribution which gives $\sigma_y^2$ the lowest maximum value
within the possible range of observations.

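(8) For a function of the second degree,

$$y = a_0 + a_1 x + a_2 x^2$$

with the standard deviation

$$s_x = \sigma(1 + ax),$$

where $0 \le a < 1$, we have $\frac{1}{k} = 1 + 2a\mu_1 + a^2\mu_2$,
and from (89),

$$\sigma^2_{a_0} = \frac{\sigma^2}{N}(1 + 2a\mu_1 + a^2\mu_2)\,\frac{\mu_2\mu_4 - \mu_3^2}{\mu_2\mu_4 - \mu_3^2 - \mu_2^3 + 2\mu_1\mu_2\mu_3 - \mu_1^2\mu_4} \quad\ldots(105),$$

$$\sigma^2_{a_1} = \frac{\sigma^2}{N}(1 + 2a\mu_1 + a^2\mu_2)\,\frac{\mu_4 - \mu_2^2}{\mu_2\mu_4 - \mu_3^2 - \mu_2^3 + 2\mu_1\mu_2\mu_3 - \mu_1^2\mu_4} \quad\ldots(106),$$

$$\sigma^2_{a_2} = \frac{\sigma^2}{N}(1 + 2a\mu_1 + a^2\mu_2)\,\frac{\mu_2 - \mu_1^2}{\mu_2\mu_4 - \mu_3^2 - \mu_2^3 + 2\mu_1\mu_2\mu_3 - \mu_1^2\mu_4} \quad\ldots(107).$$

(105) may be brought into the form

$$\sigma^2_{a_0} = \frac{\sigma^2}{N}\left\{1 + \frac{(\mu_2\mu_4 - \mu_3^2)(\mu_1 + a\mu_2)^2 + (\mu_2^2 - \mu_1\mu_3)^2}{\mu_2\,(\mu_2\mu_4 - \mu_3^2 - \mu_2^3 + 2\mu_1\mu_2\mu_3 - \mu_1^2\mu_4)}\right\},$$

where the denominator and $\mu_2\mu_4 - \mu_3^2$ are always positive. Hence the condition
for $\sigma^2_{a_0}$ taking its minimum value $\frac{\sigma^2}{N}$ is

$$\mu_1 + a\mu_2 = 0 \quad\text{and}\quad \mu_2^2 - \mu_1\mu_3 = 0$$

or

$$\frac{\mu_2}{\mu_1} = \frac{\mu_3}{\mu_2} = -\frac{1}{a} \quad\ldots(108).$$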


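We shall examine the possible distributions consisting of three groups of
observations with the frequencies $\gamma_1$, $\gamma_2$ and $\gamma_3$ at $v_1$, $v_2$ and $v_3$. The conditions
(108) require

$$\frac{\gamma_1v_1^2 + \gamma_2v_2^2 + \gamma_3v_3^2}{\gamma_1v_1 + \gamma_2v_2 + \gamma_3v_3} = \frac{\gamma_1v_1^3 + \gamma_2v_2^3 + \gamma_3v_3^3}{\gamma_1v_1^2 + \gamma_2v_2^2 + \gamma_3v_3^2} = \frac{\gamma_2v_2^2(v_2 - v_1) + \gamma_3v_3^2(v_3 - v_1)}{\gamma_2v_2(v_2 - v_1) + \gamma_3v_3(v_3 - v_1)} = -\frac{1}{a}$$

or

$$\frac{\gamma_1v_1(1 + av_1)}{v_2 - v_3} = \frac{\gamma_2v_2(1 + av_2)}{v_3 - v_1} = \frac{\gamma_3v_3(1 + av_3)}{v_1 - v_2} \quad\ldots(109).$$

Now $\frac{v_1}{v_2 - v_3}$, $\frac{v_2}{v_3 - v_1}$ and $\frac{v_3}{v_1 - v_2}$ can never all have the same sign and $(1 + av)$
is for any $v \ge -1$ positive, from which it follows that (109) leads to negative
frequencies. Nor can (109) be satisfied by two groups of observations as $\gamma_1 = 0$
requires $v_2 = v_3 = 0$, that is one group of observations at $x = 0$ which of course
gives $\sigma^2_{a_0} = \frac{\sigma^2}{N}$.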

(9) We may write (106)

$$\sigma^2_{a_1} = \frac{\sigma^2}{N}\left\{\frac{1}{\mu_2} + \frac{1}{\mu_2}\cdot\frac{(\mu_4 - \mu_2^2)(\mu_1 + a\mu_2)^2 + (\mu_3 - \mu_1\mu_2)^2}{(\mu_4 - \mu_2^2)(\mu_2 - \mu_1^2) - (\mu_3 - \mu_1\mu_2)^2}\right\},$$

where the last ratio is seen to be positive unless

$$\mu_1 + a\mu_2 = 0 \quad\text{and}\quad \mu_3 - \mu_1\mu_2 = 0 \quad\ldots(110).$$

If therefore any distribution of observations can give $\frac{1}{\mu_2}$ its minimum value 1
and at the same time fulfil those conditions it will make $\sigma^2_{a_1}$ a minimum and equal
to $\frac{\sigma^2}{N}$. But $\mu_2 = 1$ together with (110) lead to

$$\mu_3 = \mu_1 = -a,$$

which require $\frac{1+a}{2}N$ observations at $-1$ and $\frac{1-a}{2}N$ at $1$, whereas the actual
distribution must consist of $\frac{1-a}{2}N$ observations at $-1$ and $\frac{1+a}{2}N$ at $1$.

Thus the only distribution which makes $\sigma^2_{a_1}$ a minimum and equal to $\frac{\sigma^2}{N}$ is that
consisting of $\frac{1-a}{2}N$ observations at $-1$ and $\frac{1+a}{2}N$ at $1$.

(10) The general minimum conditions for $\sigma_{a_2}$ cannot be found without more
elaborate investigations into the possible variations of the moment coefficients
than are at present available and we shall limit our research to the case of three
groups of observations.


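Let us suppose $\gamma_1N$, $\gamma_2N$ and $(1 - \gamma_1 - \gamma_2)N$ observations taken at $x_1$, $x_2$
and $x_3$, and let the corresponding means be $\bar{y}_1$, $\bar{y}_2$ and $\bar{y}_3$.
We then find, when

$$\Delta = (x_1 - x_2)(x_2 - x_3)(x_3 - x_1),$$

$$a_2 = \frac{1}{\Delta}\{\bar{y}_1(x_3 - x_2) + \bar{y}_2(x_1 - x_3) + \bar{y}_3(x_2 - x_1)\},$$

$$\sigma^2_{a_2} = \frac{\sigma^2}{\Delta^2N}\left\{\frac{(x_3 - x_2)^2(1 + ax_1)^2}{\gamma_1} + \frac{(x_1 - x_3)^2(1 + ax_2)^2}{\gamma_2} + \frac{(x_2 - x_1)^2(1 + ax_3)^2}{1 - \gamma_1 - \gamma_2}\right\} \quad\ldots(111).$$

Differentiations first with regard to $\gamma_1$ and then with regard to $\gamma_2$ give the
minimum conditions

$$\frac{\gamma_1^2}{(x_3 - x_2)^2(1 + ax_1)^2} = \frac{\gamma_2^2}{(x_1 - x_3)^2(1 + ax_2)^2} = \frac{(1 - \gamma_1 - \gamma_2)^2}{(x_2 - x_1)^2(1 + ax_3)^2},$$

or, when we suppose $x_1 < x_2 < x_3$,

$$\frac{\gamma_1}{(x_3 - x_2)(1 + ax_1)} = \frac{\gamma_2}{(x_3 - x_1)(1 + ax_2)} = \frac{1 - \gamma_1 - \gamma_2}{(x_2 - x_1)(1 + ax_3)} = \frac{1}{2(x_3 - x_1)(1 + ax_2)} \quad\ldots(112).$$

With these values for $\gamma_1$ and $\gamma_2$ we get from (111)

$$\sigma^2_{a_2} = \frac{\sigma^2}{N}\left\{\frac{2(x_3 - x_1)(1 + ax_2)}{\Delta}\right\}^2 = \frac{\sigma^2}{N}\left\{\frac{2(1 + ax_2)}{(x_2 - x_1)(x_3 - x_2)}\right\}^2.$$

This for constant $x_2$ is obviously a minimum for $x_1 = -1$ and $x_3 = 1$ and is then
equal to

$$\sigma^2_{a_2} = \frac{\sigma^2}{N}\left\{\frac{2(1 + ax_2)}{1 - x_2^2}\right\}^2.$$

From this we find

$$\frac{d\sigma_{a_2}}{dx_2} = \frac{\sigma}{\sqrt{N}}\cdot\frac{2(ax_2^2 + 2x_2 + a)}{(1 - x_2^2)^2},$$

which shows that

$$x_2 = -\frac{1}{a}\left(1 - \sqrt{1 - a^2}\right)$$

determines a minimum.

The minimum value is

$$\sigma^2_{a_2} = \frac{\sigma^2}{N}\left(1 + \sqrt{1 - a^2}\right)^2,$$

and the frequencies found from (112) are

$$\frac{1}{4a}\sqrt{1-a}\left(\sqrt{1+a} - \sqrt{1-a}\right)N \text{ at } -1,$$

$$\frac{1}{4a}\sqrt{1+a}\left(\sqrt{1+a} - \sqrt{1-a}\right)N \text{ at } 1,$$

and $\frac{1}{2}N$ at $-\frac{1}{a}\left(1 - \sqrt{1 - a^2}\right)$.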
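The result of (10) can be checked by inserting the stated middle point and frequencies into (111). The sketch below is not from the memoir; it assumes $0 < a < 1$, $s_x = \sigma(1+ax)$ and end groups at $\pm 1$, and the helper names are hypothetical.

```python
import math


def three_group_design(a):
    """Middle point and actual frequencies that minimise sigma_a2 (0 < a < 1)."""
    root = math.sqrt(1.0 - a * a)
    x2 = -(1.0 - root) / a
    spread = math.sqrt(1.0 + a) - math.sqrt(1.0 - a)
    freq_minus = math.sqrt(1.0 - a) * spread / (4.0 * a)   # fraction at x = -1
    freq_plus = math.sqrt(1.0 + a) * spread / (4.0 * a)    # fraction at x = +1
    return x2, (freq_minus, 0.5, freq_plus)


def var_a2(a, points, freqs):
    """N*sigma_a2^2/sigma^2 for three groups, following (111)."""
    x1, x2, x3 = points
    delta = (x1 - x2) * (x2 - x3) * (x3 - x1)
    gaps = (x3 - x2, x1 - x3, x2 - x1)
    return sum((g * (1.0 + a * x)) ** 2 / f
               for g, x, f in zip(gaps, points, freqs)) / delta ** 2


a = 0.6
x2, freqs = three_group_design(a)
value = var_a2(a, (-1.0, x2, 1.0), freqs)
print(x2, freqs, value, (1.0 + math.sqrt(1.0 - a * a)) ** 2)
```

For $a = .6$ the middle group lies at $x_2 = -\tfrac{1}{3}$, the three fractions are $\tfrac{1}{6}$, $\tfrac{1}{2}$ and $\tfrac{1}{3}$, and both printed values equal $3.24$ up to rounding.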
IX. Adjustment with regard to both of two variates connected by 
a linear relation. 

(1) The case often occurs when both of the variates observed have errors of 
observations of the same order so that adjustment only of one of them is unsatis- 
factory. We shall therefore in this section consider adjustment with regard to 
both of the variates and give the adjusted relation between them and the standard 
deviations of the constants. 


Let $x'$ be observed with the standard deviation $\sqrt{\alpha}\,\sigma$ and $y'$ with the standard
deviation $\sqrt{\gamma}\,\sigma$, we shall then for the sake of greater perspicuity exchange the
variates for $x = \frac{x'}{\sqrt{\alpha}}$ and $y = \frac{y'}{\sqrt{\gamma}}$ so that both of our variates have the same
standard deviation $\sigma$. Let $\frac{1}{N}\Sigma\{x^sy^t\}$ taken over the $N$ pairs of observations be
denoted by $\mu_{st}$, we then find, by adjusting only the $y$'s according to (3),

$$\begin{vmatrix} y & 1 & x \\ \mu_{01} & 1 & \mu_{10} \\ \mu_{11} & \mu_{10} & \mu_{20} \end{vmatrix} = 0,$$

or

$$y - \mu_{01} = \frac{\mu_{11} - \mu_{01}\mu_{10}}{\mu_{20} - \mu_{10}^2}\,(x - \mu_{10}) \quad\ldots(113).$$

By adjusting only the $x$'s we get

$$y - \mu_{01} = \frac{\mu_{02} - \mu_{01}^2}{\mu_{11} - \mu_{01}\mu_{10}}\,(x - \mu_{10}) \quad\ldots(114),$$

which only coincide with (113) when

$$(\mu_{20} - \mu_{10}^2)(\mu_{02} - \mu_{01}^2) = (\mu_{11} - \mu_{01}\mu_{10})^2,$$

that is when there is perfect correlation between $x$ and $y$ and no casual errors of
observation.


(2) Adjusting at the same time with regard to $x$ and $y$ may be transformed to
the problem of finding the straight line for which the sum of the squared distances
of the observed points $(x, y)$ is a minimum.

Let the line sought be

$$x\cos v + y\sin v + p = 0.$$

The sum which we want to make a minimum is then

$$S = \mu_{20}\cos^2v + \mu_{02}\sin^2v + 2\mu_{11}\cos v\sin v + 2p\mu_{10}\cos v + 2p\mu_{01}\sin v + p^2,$$

$\frac{dS}{dp} = 0$ requires $p = -\mu_{10}\cos v - \mu_{01}\sin v$,
indicating that the line passes through the mean $(\mu_{10}, \mu_{01})$; this determines a
minimum for constant $v$.

The corresponding $S$ is

$$S = (\mu_{20} - \mu_{10}^2)\cos^2v + (\mu_{02} - \mu_{01}^2)\sin^2v + 2(\mu_{11} - \mu_{01}\mu_{10})\cos v\sin v \quad\ldots(115),$$

which differentiated with regard to $v$ gives

$$\frac{dS}{dv} = -\{\mu_{20} - \mu_{10}^2 - (\mu_{02} - \mu_{01}^2)\}\sin 2v + 2(\mu_{11} - \mu_{01}\mu_{10})\cos 2v.$$

It thus follows that

$$\tan 2v = \frac{2(\mu_{11} - \mu_{01}\mu_{10})}{\mu_{20} - \mu_{10}^2 - (\mu_{02} - \mu_{01}^2)} = \frac{2\tan v}{1 - \tan^2v},$$

or

$$\tan v = \frac{\mu_{02} - \mu_{01}^2 - (\mu_{20} - \mu_{10}^2) \pm \sqrt{[\mu_{02} - \mu_{01}^2 - (\mu_{20} - \mu_{10}^2)]^2 + 4[\mu_{11} - \mu_{10}\mu_{01}]^2}}{2(\mu_{11} - \mu_{01}\mu_{10})} \quad\ldots(116)$$

determine a maximum and a minimum of $S$.

Substituting in (115) we find

$$S = \tfrac{1}{2}\left\{\mu_{20} - \mu_{10}^2 + \mu_{02} - \mu_{01}^2 \pm \sqrt{[\mu_{20} - \mu_{10}^2 - (\mu_{02} - \mu_{01}^2)]^2 + 4[\mu_{11} - \mu_{01}\mu_{10}]^2}\right\}$$

so that the minimum corresponds to the negative sign of the root in (116).


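The adjusted function connecting $x$ and $y$ is hence a line through the general
mean forming an angle $u$ with the $x$-axis which is determined by

$$\tan u = -\cot v = \frac{\mu_{02} - \mu_{01}^2 - (\mu_{20} - \mu_{10}^2) + \sqrt{[\mu_{02} - \mu_{01}^2 - (\mu_{20} - \mu_{10}^2)]^2 + 4[\mu_{11} - \mu_{01}\mu_{10}]^2}}{2(\mu_{11} - \mu_{01}\mu_{10})}.$$

For the variates $x'$ and $y'$ there must to this value of the tangent be added the
factor $\sqrt{\frac{\gamma}{\alpha}}$; expressed by the moment coefficients of $x'$ and $y'$ we therefore find

$$\tan u = \frac{\alpha(\mu_{02} - \mu_{01}^2) - \gamma(\mu_{20} - \mu_{10}^2) + \sqrt{[\alpha(\mu_{02} - \mu_{01}^2) - \gamma(\mu_{20} - \mu_{10}^2)]^2 + 4\alpha\gamma[\mu_{11} - \mu_{01}\mu_{10}]^2}}{2\alpha(\mu_{11} - \mu_{01}\mu_{10})}.$$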
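The slope formula for the adjusted line lends itself to a short computation. The following is a minimal sketch, assuming the observations are given as two lists of equal length and that the product moment about the mean is not zero; the function names are illustrative, and `alpha` and `gamma` stand for the factors $\alpha$ and $\gamma$ of the two error variances.

```python
import math


def central_moments(xs, ys):
    """Means and second-order moment coefficients about the mean."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs) / n
    syy = sum((y - my) ** 2 for y in ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    return mx, my, sxx, syy, sxy


def orthogonal_slope(xs, ys, alpha=1.0, gamma=1.0):
    """tan u of the adjusted line when the error variances of x and y are
    alpha*sigma^2 and gamma*sigma^2 (requires a non-zero product moment)."""
    _, _, sxx, syy, sxy = central_moments(xs, ys)
    diff = alpha * syy - gamma * sxx
    root = math.sqrt(diff ** 2 + 4.0 * alpha * gamma * sxy ** 2)
    return (diff + root) / (2.0 * alpha * sxy)


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.1, 0.9, 2.3, 2.8, 4.1]
_, _, sxx, syy, sxy = central_moments(xs, ys)
print(sxy / sxx, orthogonal_slope(xs, ys), syy / sxy)
```

For these illustrative data the printed slope of the adjusted line lies between the two regression slopes, in agreement with the proposition proved in the next paragraph of the text.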


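(3) We shall prove that the line is situated between the two regression curves
(113) and (114).

Making $(\mu_{10}, \mu_{01})$ the zero point of the coordinates, the three tangents to be compared are

$$\frac{\mu_{11}}{\mu_{20}}, \quad \frac{\mu_{02}}{\mu_{11}} \quad\text{and}\quad \frac{1}{2\mu_{11}}\left\{\mu_{02} - \mu_{20} + \sqrt{(\mu_{02} - \mu_{20})^2 + 4\mu_{11}^2}\right\} = \tan u,$$

where the $\mu$'s now are the moment coefficients about the mean.

According to $\mu_{11} \gtrless 0$ we have

$$\frac{\mu_{11}}{\mu_{20}} \lessgtr \frac{\mu_{02}}{\mu_{11}},$$

since $\mu_{11}^2 < \mu_{20}\,\mu_{02}$.

As $\sqrt{(\mu_{02} - \mu_{20})^2 + 4\mu_{11}^2} < \mu_{02} + \mu_{20}$,
we have $\tan u \lessgtr \frac{\mu_{02}}{\mu_{11}}$.

It rests to compare $\tan u$ and $\frac{\mu_{11}}{\mu_{20}}$; we find

$$\tan u - \frac{\mu_{11}}{\mu_{20}} = \frac{1}{2\mu_{11}}\left\{\mu_{02} - \mu_{20} - \frac{2\mu_{11}^2}{\mu_{20}} + \sqrt{\left(\mu_{02} - \mu_{20} - \frac{2\mu_{11}^2}{\mu_{20}}\right)^2 + \frac{4\mu_{11}^2}{\mu_{20}^2}(\mu_{02}\mu_{20} - \mu_{11}^2)}\right\}.$$

The factor in curled brackets is hence positive and we have $\tan u >$ or $< \frac{\mu_{11}}{\mu_{20}}$
according as $\mu_{11} >$ or $< 0$,

we have thus proved that

$$\frac{\mu_{11}}{\mu_{20}} \lessgtr \tan u \lessgtr \frac{\mu_{02}}{\mu_{11}}.$$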

(4) In order to find the standard deviations of the constants of the line we
shall express the observations, the standard deviations of which are $\sqrt{\alpha}\,\sigma$ and
$\sqrt{\gamma}\,\sigma$, by a parameter $r$ to get an equation for each observation.

Suppose

$$x_i = a + r_i\cos u,$$
$$y_i = b + r_i\sin u,$$

and suppose we have a good approximation for $a$, $b$, $u$, $r_1$, $r_2$ ...... $r_N$ from which is
calculated $x$ and $y$ corresponding to the observations. The differences between
observed and calculated $x$ and $y$ can then be expressed by

$$\Delta x_i = \Delta a - r_i\sin u\,\Delta u + \cos u\,\Delta r_i,$$
$$\Delta y_i = \Delta b + r_i\cos u\,\Delta u + \sin u\,\Delta r_i \quad\ldots(119),$$

and we can carry out an adjustment, $\Delta a$, $\Delta b$, $\Delta u$, $\Delta r_1$, $\Delta r_2$ ... $\Delta r_N$ being the
elements.

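The normal equations are:

$$\frac{1}{\alpha}\Sigma\{\Delta x_i\} = \frac{N}{\alpha}\Delta a + 0\cdot\Delta b - \Sigma\{r_i\}\frac{\sin u}{\alpha}\Delta u + \frac{\cos u}{\alpha}\Delta r_1 + \ldots + \frac{\cos u}{\alpha}\Delta r_N,$$

$$\frac{1}{\gamma}\Sigma\{\Delta y_i\} = 0\cdot\Delta a + \frac{N}{\gamma}\Delta b + \Sigma\{r_i\}\frac{\cos u}{\gamma}\Delta u + \frac{\sin u}{\gamma}\Delta r_1 + \ldots + \frac{\sin u}{\gamma}\Delta r_N,$$

$$\Sigma\left\{r_i\left[-\frac{\sin u}{\alpha}\Delta x_i + \frac{\cos u}{\gamma}\Delta y_i\right]\right\} = -\frac{\sin u}{\alpha}\Sigma\{r_i\}\Delta a + \frac{\cos u}{\gamma}\Sigma\{r_i\}\Delta b + \Sigma\{r_i^2\}\left[\frac{\sin^2u}{\alpha} + \frac{\cos^2u}{\gamma}\right]\Delta u$$
$$+\ r_1\left(\frac{1}{\gamma} - \frac{1}{\alpha}\right)\cos u\sin u\,\Delta r_1 + \ldots + r_N\left(\frac{1}{\gamma} - \frac{1}{\alpha}\right)\cos u\sin u\,\Delta r_N,$$

$$\frac{\cos u}{\alpha}\Delta x_1 + \frac{\sin u}{\gamma}\Delta y_1 = \frac{\cos u}{\alpha}\Delta a + \frac{\sin u}{\gamma}\Delta b + r_1\left(\frac{1}{\gamma} - \frac{1}{\alpha}\right)\cos u\sin u\,\Delta u + \left(\frac{\cos^2u}{\alpha} + \frac{\sin^2u}{\gamma}\right)\Delta r_1 + \ldots + 0\cdot\Delta r_N,$$

$$\ldots\ldots$$

$$\frac{\cos u}{\alpha}\Delta x_N + \frac{\sin u}{\gamma}\Delta y_N = \frac{\cos u}{\alpha}\Delta a + \frac{\sin u}{\gamma}\Delta b + r_N\left(\frac{1}{\gamma} - \frac{1}{\alpha}\right)\cos u\sin u\,\Delta u + 0\cdot\Delta r_1 + \ldots + \left(\frac{\cos^2u}{\alpha} + \frac{\sin^2u}{\gamma}\right)\Delta r_N.$$

Eliminating $\Delta r_1$, $\Delta r_2$ ... $\Delta r_N$ from the first and the third of these equations by means
of the last $N$ equations, we obtain

$$\Sigma\{\sin u\,\Delta x_i - \cos u\,\Delta y_i\} = N\sin u\,\Delta a - N\cos u\,\Delta b - \Sigma\{r_i\}\Delta u \quad\ldots(120),$$

and

$$\Sigma\{r_i[\sin u\,\Delta x_i - \cos u\,\Delta y_i]\} = \Sigma\{r_i\}\sin u\,\Delta a - \Sigma\{r_i\}\cos u\,\Delta b - \Sigma\{r_i^2\}\Delta u \quad\ldots(121).$$

By eliminating the $\Delta r$'s from the second of the normal equations we get an equation
identical with (120), which shows that we have one more element than we can
determine.

From (120) and (121) we are however able to find
$(\sin u\,\Delta a - \cos u\,\Delta b)$ and $\Delta u$; we get

$$\sin u\,\Delta a - \cos u\,\Delta b = \frac{1}{N(m_2 - m_1^2)}\,\Sigma\{(m_2 - m_1r_i)(\sin u\,\Delta x_i - \cos u\,\Delta y_i)\}$$

and

$$\Delta u = \frac{1}{N(m_2 - m_1^2)}\,\Sigma\{(m_1 - r_i)(\sin u\,\Delta x_i - \cos u\,\Delta y_i)\},$$

where $m_1 = \frac{1}{N}\Sigma\{r_i\}$ and $m_2 = \frac{1}{N}\Sigma\{r_i^2\}$.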

For a point of the adjusted line corresponding to $r_q$ we find, according to (119),

$$p_q = \sin u\,\Delta x_q - \cos u\,\Delta y_q = \sin u\,\Delta a - \cos u\,\Delta b - r_q\Delta u.$$

The standard deviation of $p_q$ is seen to be the standard deviation of the position
of the adjusted point $(x_q, y_q)$ in the direction at right angles to the line.

We find

$$p_q = \frac{1}{N(m_2 - m_1^2)}\,\Sigma\{[m_2 - m_1r_i - r_q(m_1 - r_i)](\sin u\,\Delta x_i - \cos u\,\Delta y_i)\}$$

and

$$\sigma^2_{p_q} = \frac{\sigma^2}{N}(\alpha\sin^2u + \gamma\cos^2u)\left\{1 + \frac{(r_q - m_1)^2}{m_2 - m_1^2}\right\}.$$

This standard deviation is quite analogous to that obtained for an adjusted 
ordinate when the abscissa is errorless and gives the same indications for the dis- 
tribution of the observations. 

For $\sigma_u$ we find

$$\sigma_u^2 = \frac{\sigma^2(\alpha\sin^2u + \gamma\cos^2u)}{N(m_2 - m_1^2)},$$

again emphasising that the standard deviation of the $r$'s ought to be a maximum
to give the best determination of the line.
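The two standard deviations just obtained are easily evaluated for a given set of parameters $r_i$. The sketch below is illustrative only; it takes the formulas for $\sigma_{p_q}$ and $\sigma_u$ as stated above, and the function name and argument order are assumptions.

```python
import math


def line_standard_deviations(rs, u, alpha, gamma, sigma, r_q):
    """sigma_u and sigma_p at parameter r_q for the adjusted line (sketch)."""
    n = len(rs)
    m1 = sum(rs) / n
    m2 = sum(r * r for r in rs) / n
    spread = m2 - m1 * m1                      # squared s.d. of the r's
    unit = sigma ** 2 * (alpha * math.sin(u) ** 2 + gamma * math.cos(u) ** 2)
    var_u = unit / (n * spread)
    var_p = unit / n * (1.0 + (r_q - m1) ** 2 / spread)
    return math.sqrt(var_u), math.sqrt(var_p)


# Two observations at r = -1 and r = +1, equal errors in x and y
print(line_standard_deviations([-1.0, 1.0], 0.0, 1.0, 1.0, 1.0, 0.0))
print(line_standard_deviations([-1.0, 1.0], 0.0, 1.0, 1.0, 1.0, 1.0))
```

Spreading the $r$'s further apart increases `spread` and lowers both values, which is the indication drawn in the text.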


In conclusion I should like to express my thanks to Miss H. Gertrude Jones 
for the care she has devoted to the preparation of the diagrams in this paper. 


ON THE PRODUCT-MOMENTS OF VARIOUS ORDERS OF 
THE NORMAL CORRELATION SURFACE OF TWO 
VARIATES. 


By K. PEARSON AND A. W. YOUNG.


(1) In several recent investigations we have found it desirable to have the
values of product-moment coefficients about the mean of the normal correlation
surface. The present paper deals with the case of two variates. If the correlation
surface be

$$z = \frac{N}{2\pi\sigma_x\sigma_y\sqrt{1 - r^2}}\;e^{-\frac{1}{2(1-r^2)}\left(\frac{x^2}{\sigma_x^2} - \frac{2rxy}{\sigma_x\sigma_y} + \frac{y^2}{\sigma_y^2}\right)} \quad\ldots(i),$$

where $\sigma_x$ and $\sigma_y$ are the standard deviations of the two variates $x$ and $y$ and $r$ their
correlation, then we define the $s$th-$t$th product-moment coefficient to be

$$q_{s,t} = \frac{1}{N}\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} z\,x^sy^t\,dx\,dy \quad\ldots(ii).$$

Further we write

$$p_{s,t} = q_{s,t}/(\sigma_x^s\,\sigma_y^t) \quad\ldots(iii),$$

so that $p_{s,t}$ is a purely numerical quantity and a function of the variable $r$ only.
Clearly from the symmetry of the surface

$$p_{2s,\,2t+1} = p_{2s+1,\,2t} = 0.$$

We are accordingly only concerned with cases in which $s + t$ is even.

We propose therefore first to give the general algebraical expressions for the
lower values of $p_{s,t}$, and secondly to provide tables for the numerical values of
these product moments proceeding by increments of .05 in $r$.

Since $s + t$ must be even if $p_{s,t}$ be not zero, it follows that $s$ and $t$ must either
both be even or both be odd. In the former case $p_{s,t}$ does not alter when $r$ changes
sign; in the latter case $p_{s,t}$ for negative $r$ is simply $p_{s,t}$ for positive $r$ with the sign
changed. It is accordingly only needful to table $p_{s,t}$ for positive values of $r$.

For the purpose of testing computations the following formulae are of value: 


$$p_{s,t} = (s + t - 1)\,r\,p_{s-1,\,t-1} + (s-1)(t-1)(1 - r^2)\,p_{s-2,\,t-2} \quad\ldots(iv),$$
Pat — (6 91) Ds ¢22 te SHP emi e 1) — NS a) sate al ts Oona eee (v). 
Or, again we may write sso AST) Tig age cosine coke oh ane cee ee eee (vi), 
and we have res (a IG Meet e rr sotonGamedtodhcadonacoos (vil), 


which is capable of numerical evaluation in a single machine operation. 
The general values for any normal product-moment coefficient are

$$p_{2s,\,2t} = \frac{(2s)!\,(2t)!}{2^{s+t}}\;\Sigma_j\left\{\frac{(2r)^{2j}}{(s-j)!\,(t-j)!\,(2j)!}\right\} \quad\ldots(viii),$$

$$p_{2s+1,\,2t+1} = \frac{r\,(2s+1)!\,(2t+1)!}{2^{s+t}}\;\Sigma_j\left\{\frac{(2r)^{2j}}{(s-j)!\,(t-j)!\,(2j+1)!}\right\} \quad\ldots(ix),$$

the sums being taken from $j = 0$ to the lesser of $s$ and $t$.

(2) We are now in a position to set down the algebraical values of the product-moment coefficients.

(a) $s$ or $t = 0$.
$p_{0,2t} = p_{2t,0} = 1\cdot3\cdot5\ldots(2t-1)$,
$p_{0,0} = 1$, $p_{2,0} = 1$, $p_{4,0} = 3$, $p_{6,0} = 15$, $p_{8,0} = 105$, $p_{10,0} = 945$.
These are of course the simple moment coefficients of the normal curve when
the unit of abscissal length is the standard deviation.

(b) $s$ or $t = 1$.
$p_{1,2t+1} = p_{2t+1,1} = 1\cdot3\cdot5\ldots(2t+1)\,r$,
$p_{1,1} = r$, $p_{3,1} = 3r$, $p_{5,1} = 15r$, $p_{7,1} = 105r$, $p_{9,1} = 945r$.
Generally $p_{2t+1,1}/p_{2t+2,0} = r$ and provides a means of finding $r$ and testing how
far the correlation of two variates is normal.

(c) $s$ or $t = 2$.
$p_{2,2t} = p_{2t,2} = 1\cdot3\cdot5\ldots(2t-1)\{1 + 2tr^2\}$,
$p_{2,2} = 1 + 2r^2$, $p_{2,4} = 3(1 + 4r^2)$, $p_{2,6} = 15(1 + 6r^2)$,
$p_{2,8} = 105(1 + 8r^2)$, $p_{2,10} = 945(1 + 10r^2)$.

(d) $s$ or $t = 3$.
$p_{3,2t+1} = p_{2t+1,3} = 1\cdot3\cdot5\ldots(2t+1)\,r\{3 + 2tr^2\}$,
$p_{3,3} = 3r(3 + 2r^2)$, $p_{3,5} = 15r(3 + 4r^2)$, $p_{3,7} = 105r(3 + 6r^2)$, $p_{3,9} = 945r(3 + 8r^2)$.

(e) $s$ or $t = 4$.
$p_{4,2t} = p_{2t,4} = 1\cdot3\cdot5\ldots(2t-1)\{3 + 12tr^2 + 4t(t-1)r^4\}$,
$p_{4,4} = 3(3 + 24r^2 + 8r^4)$, $p_{4,6} = 15(3 + 36r^2 + 24r^4)$,
$p_{4,8} = 105(3 + 48r^2 + 48r^4)$, $p_{4,10} = 945(3 + 60r^2 + 80r^4)$.

(f) $s$ or $t = 5$.
$p_{5,2t+1} = p_{2t+1,5} = 1\cdot3\cdot5\ldots(2t+1)\,r\{15 + 20tr^2 + 4t(t-1)r^4\}$,
$p_{5,5} = 15r(15 + 40r^2 + 8r^4)$, $p_{5,7} = 105r(15 + 60r^2 + 24r^4)$,
$p_{5,9} = 945r(15 + 80r^2 + 48r^4)$.


(g) $s$ or $t = 6$.
$p_{6,2t} = p_{2t,6} = 1\cdot3\cdot5\ldots(2t-1)\{15 + 90tr^2 + 60t(t-1)r^4 + 8t(t-1)(t-2)r^6\}$,
$p_{6,6} = 15(15 + 270r^2 + 360r^4 + 48r^6)$, $p_{6,8} = 105(15 + 360r^2 + 720r^4 + 192r^6)$,
$p_{6,10} = 945(15 + 450r^2 + 1200r^4 + 480r^6)$.

(h) $s$ or $t = 7$.
$p_{7,2t+1} = p_{2t+1,7} = 1\cdot3\cdot5\ldots(2t+1)\,r\{105 + 210tr^2 + 84t(t-1)r^4 + 8t(t-1)(t-2)r^6\}$,
$p_{7,7} = 105r(105 + 630r^2 + 504r^4 + 48r^6)$,
$p_{7,9} = 945r(105 + 840r^2 + 1008r^4 + 192r^6)$.

(i) $s$ or $t = 8$.
$p_{8,2t} = p_{2t,8} = 1\cdot3\cdot5\ldots(2t-1)\{105 + 840tr^2 + 840t(t-1)r^4 + 224t(t-1)(t-2)r^6 + 16t(t-1)(t-2)(t-3)r^8\}$,
$p_{8,8} = 105(105 + 3360r^2 + 10080r^4 + 5376r^6 + 384r^8)$,
$p_{8,10} = 945(105 + 4200r^2 + 16800r^4 + 13440r^6 + 1920r^8)$.

(j) $s$ or $t = 9$.
$p_{9,2t+1} = p_{2t+1,9} = 1\cdot3\cdot5\ldots(2t+1)\,r\{945 + 2520tr^2 + 1512t(t-1)r^4 + 288t(t-1)(t-2)r^6 + 16t(t-1)(t-2)(t-3)r^8\}$,
$p_{9,9} = 945r(945 + 10080r^2 + 18144r^4 + 6912r^6 + 384r^8)$.

(k) $s$ or $t = 10$.
$p_{10,2t} = p_{2t,10} = 1\cdot3\cdot5\ldots(2t-1)\{945 + 9450tr^2 + 12600t(t-1)r^4 + 5040t(t-1)(t-2)r^6 + 720t(t-1)(t-2)(t-3)r^8 + 32t(t-1)(t-2)(t-3)(t-4)r^{10}\}$,
$p_{10,10} = 945(945 + 47250r^2 + 252000r^4 + 302400r^6 + 86400r^8 + 3840r^{10})$.

The table on pp. 90-1 gives the numerical values of these coefficients. We
proceed to illustrate their use.
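The algebraical values above can be tested against the recurrence $p_{s,t} = (s+t-1)\,r\,p_{s-1,t-1} + (s-1)(t-1)(1-r^2)\,p_{s-2,t-2}$. The following is a minimal sketch using exact rational arithmetic; the function name is illustrative, and the starting values are the simple normal moments of case (a).

```python
from fractions import Fraction
from functools import lru_cache


def product_moment(s, t, r):
    """p_{s,t} of the standardised normal surface via the recurrence."""
    r = Fraction(r)

    @lru_cache(maxsize=None)
    def p(s, t):
        if s < 0 or t < 0 or (s + t) % 2:
            return Fraction(0)                # odd total order vanishes
        if s == 0 and t == 0:
            return Fraction(1)
        if s == 0:
            return (t - 1) * p(0, t - 2)      # simple normal moments
        if t == 0:
            return (s - 1) * p(s - 2, 0)
        return ((s + t - 1) * r * p(s - 1, t - 1)
                + (s - 1) * (t - 1) * (1 - r * r) * p(s - 2, t - 2))

    return p(s, t)


half = Fraction(1, 2)
print(product_moment(4, 4, half), 3 * (3 + 24 * half**2 + 8 * half**4))
print(product_moment(10, 10, 1))
```

For $r = 1$ the coefficient $p_{10,10}$ must reduce to the twentieth moment of the normal curve, $1\cdot3\cdot5\ldots19$, which gives a check on the numerical coefficients in (k).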


Illustration I. In discussing the relation of auricular height ($y$) with age ($x$)
of a girl's head a sample of 2272 individuals was found to provide the following
product-moment coefficients:

$q_{1,1} = 3.113,712$, $q_{3,1} = 74.447,616$,

$q_{2,1} = -1.957,022$, $q_{4,1} = -108.701,559$.

Are these incompatible with normal correlation? (See K. Pearson, On the General
Theory of Skew Correlation and non-linear Regression, Drapers' Company Research
Memoirs, Biometric Series II, p. 35.) We have

$\sigma_x = 3.064,819$, $\sigma_y = 3.454,125$,

and $r = .294,128$,
and the leading subscript above corresponds to the $x$ coordinate. We need first
the values of $q_{2,1}$, $q_{3,1}$ and $q_{4,1}$ on the hypothesis of normality. Clearly $q_{2,1}$ and $q_{4,1}$
will be zero, and using linear interpolation:

$$q_{3,1} = \sigma_x^3\,\sigma_y\,p'_{3,1} = 99.437,979 \times .88256 = 87.759,983 = 87.7600, \text{ say.}$$

In the next place we require the probable errors of these q’s. The general 
expression for the probable error of a product-moment about the mean is given 
in Biometrika, Vol. IX, p. 38. In our present notation it is

$$N\sigma^2_{q_{s,t}} = q_{2s,2t} - q_{s,t}^2 + s^2 q_{2,0}\,q_{s-1,t}^2 + t^2 q_{0,2}\,q_{s,t-1}^2 + 2st\,q_{1,1}\,q_{s-1,t}\,q_{s,t-1} - 2s\,q_{s+1,t}\,q_{s-1,t} - 2t\,q_{s,t+1}\,q_{s,t-1}.$$

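Now remembering that for a normal distribution $q_{s,t}$ vanishes when $s + t$ is odd, and that $q_{4,0} = 3\sigma_x^4$, while $q_{s,t} = \sigma_x^s\sigma_y^t\,p_{s,t}$, the tabled $p'_{s,t}$ being $p_{s,t}$ divided where necessary by a power of ten, we have
$$N\sigma^2_{q_{2,1}} = q_{4,2} + 4q_{2,0}q_{1,1}^2 + q_{0,2}q_{2,0}^2 + 4q_{1,1}^2q_{2,0} - 4q_{3,1}q_{1,1} - 2q_{2,2}q_{2,0} = \sigma_x^4\sigma_y^2\,(10p'_{4,2} + 4r^2 + 1 + 4r^2 - 4r\,p'_{3,1} - 2p'_{2,2}),$$
$$\sigma_{q_{2,1}} = \frac{\sigma_x^2\sigma_y}{\sqrt{N}}\sqrt{10p'_{4,2} + 8r^2 + 1 - 4r\,p'_{3,1} - 2p'_{2,2}} \qquad \text{(x).}$$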
$$N\sigma^2_{q_{3,1}} = q_{6,2} - q_{3,1}^2 = \sigma_x^6\sigma_y^2\,(100p'_{6,2} - p'^2_{3,1}),$$
$$\sigma_{q_{3,1}} = \frac{\sigma_x^3\sigma_y}{\sqrt{N}}\sqrt{100p'_{6,2} - p'^2_{3,1}} \qquad \text{(xi).}$$

$$N\sigma^2_{q_{4,1}} = q_{8,2} - q_{4,1}^2 + 16q_{2,0}q_{3,1}^2 + q_{0,2}q_{4,0}^2 + 8q_{1,1}q_{3,1}q_{4,0} - 8q_{5,1}q_{3,1} - 2q_{4,2}q_{4,0}$$
$$= \sigma_x^8\sigma_y^2\,\{1000p'_{8,2} + 16p'^2_{3,1} + 9 + 24r\,p'_{3,1} - 80p'_{5,1}p'_{3,1} - 60p'_{4,2}\},$$
$$\sigma_{q_{4,1}} = \frac{\sigma_x^4\sigma_y}{\sqrt{N}}\sqrt{1000p'_{8,2} + 16p'^2_{3,1} + 9 + 24r\,p'_{3,1} - 80p'_{5,1}p'_{3,1} - 60p'_{4,2}} \qquad \text{(xii).}$$


We require accordingly to determine the following $p$'s: $p'_{2,2}$, $p'_{3,1}$, $p'_{4,2}$, $p'_{5,1}$, $p'_{6,2}$ and $p'_{8,2}$ by aid of our table with second differences or direct calculation from the algebraic values in terms of $r$. We have

$p'_{2,2} = 1.173{,}0226$, $p'_{3,1} = .782{,}3840$, $p'_{4,2} = .403{,}8135$,
$p'_{5,1} = .441{,}1920$, $p'_{6,2} = .227{,}860{,}02$, $p'_{8,2} = .177{,}6695$.


Also $.6744898/\sqrt{N} = .014{,}1505$.
Substituting we find the following probable errors:
P.E. of $r$ = .012,926,
P.E. of $q_{2,1}$ = .720,651,
P.E. of $q_{3,1}$ = 6.625,903,
P.E. of $q_{4,1}$ = 51.2677.
We can now sum up our results for these data: 
$r = .2941 \pm .0129$,
$q_{2,1} = 0 \pm .7206$,
$q_{3,1} = 87.7600 \pm 6.6259$,
$q_{4,1} = 0 \pm 51.2677$.

The probable errors would have been to some extent modified had we been able to calculate them on the true and not the observed $r$. We have
$\Delta q_{2,1}$/P.E. of $q_{2,1}$ = $-2.716$,
$\Delta q_{3,1}$/P.E. of $q_{3,1}$ = $-2.009$,
$\Delta q_{4,1}$/P.E. of $q_{4,1}$ = $-2.120$.

Thus none of the deviations are excessive in terms of their probable errors. 
The system accordingly does not diverge very widely from the normal. At the 
same time the deviations are all in one sense, i.e. in defect of the normal value, and
are all greater than twice the probable error. It appears therefore probable that 
there is some significant if slight deviation from normal correlation in the growth 
of the auricular height. 
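The three ratios quoted in Illustration I follow directly from the observed values, the normal-theory values and the probable errors stated in the text. A minimal sketch of that arithmetic is given below; the dictionary keys are illustrative labels, and the figures are copied from the illustration rather than recomputed from the sample.

```python
# Observed product-moment coefficients from Illustration I.
observed = {"q21": -1.957022, "q31": 74.447616, "q41": -108.701559}
# Values expected under normal correlation (q21 and q41 vanish).
normal = {"q21": 0.0, "q31": 87.759983, "q41": 0.0}
# Probable errors as stated in the text.
probable_error = {"q21": 0.720651, "q31": 6.625903, "q41": 51.2677}

# Deviation from the normal value in units of the probable error.
ratios = {
    name: (observed[name] - normal[name]) / probable_error[name]
    for name in observed
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.3f}")
```

All three ratios are negative and exceed 2 in magnitude, which is the point the paragraph above makes.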


Illustration II. For the correlation of the contemporaneous barometric heights at Laudale and Southampton the following values have been found:
Southampton ($x$) $\sigma_x = 3.250{,}067$, Laudale ($y$) $\sigma_y$ = [illegible], $r = .780{,}225$, $N = 2922$.
The $\beta$'s of the marginal distributions show a markedly skew and non-normal system. The regression is, however, closely linear. Discuss the values of the product-moment coefficients:

$q_{2,1} = 11.919{,}404$,
$q_{1,2} = 15.598{,}613$,
$q_{2,2} = 401.523{,}496$.
For a normal system with the above correlation coefficient we should have:
$q_{2,1} = q_{1,2} = 0$ and $q_{2,2} = \sigma_x^2\sigma_y^2\,p'_{2,2} = 362.192{,}761$.
Thus $\Delta q_{2,1} = 11.919{,}404$,
$\Delta q_{1,2} = 15.598{,}613$,
$\Delta q_{2,2} = 39.330{,}735$.

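We require to consider the probable errors of the $q$'s, which are given by .6744898 times the following standard deviations:

$$\sigma_{q_{2,1}} = \frac{\sigma_x^2\sigma_y}{\sqrt{N}}\sqrt{10p'_{4,2} + 8r^2 + 1 - 4r\,p'_{3,1} - 2p'_{2,2}},$$
$$\sigma_{q_{1,2}} = \frac{\sigma_x\sigma_y^2}{\sqrt{N}}\sqrt{10p'_{2,4} + 8r^2 + 1 - 4r\,p'_{1,3} - 2p'_{2,2}},$$
$$\sigma_{q_{2,2}} = \frac{\sigma_x^2\sigma_y^2}{\sqrt{N}}\sqrt{100p'_{4,4} - p'^2_{2,2}} \qquad \text{(xiii).}$$
We determine for the above value of $r$:
$p'_{2,2} = 2.217{,}5021$, $p'_{3,1} = p'_{1,3}$ = [illegible],
$p'_{2,4} = p'_{4,2} = 1.030{,}502{,}46$, $p'_{4,4} = .617{,}239{,}437$.
Our results for a normal distribution are:
P.E. of $r$ = .004,88 [remaining digits illegible],

P.E. of $q_{2,1}$ = 1.091,473,

P.E. of $q_{1,2}$ = 1.320,585,

P.E. of $q_{2,2}$ = 15.360,681.

Hence $\Delta q_{2,1}$/P.E. of $q_{2,1}$ = 10.920,
$\Delta q_{1,2}$/P.E. of $q_{1,2}$ = 11.812,

$\Delta q_{2,2}$/P.E. of $q_{2,2}$ = 2.560.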

The deviations in the higher moment coefficients are at once seen to be markedly 
significant. But it will be noted that $q_{2,2}$ as in the previous case does not differ so
markedly in value from the normal as the odd moment coefficients. It seems there- 
fore likely, when a distribution is markedly skew, but the regression linear, that 
the even-even product-moment coefficients will not differ widely from the normal 
values, but that the even-odd ones will do so. It is possible that this is related to 
the fact that in distributions (such as 3 x 3 tables) which can be reduced in various 
ways to a tetrachoric table, correlation calculated from regression line diagonal 
cells is usually far more accurate than correlation calculated from non-regression 
line diagonal cells. 

Equations (x) to (xii) are of value beyond the present illustrations. Further 
uses of the above formulae and tables are provided in a memoir on "Generalised
Tchebycheff Theorems" which will shortly be published. We have to thank
Dr Kirstine Smith for much help in the preparation of this paper. 


THE CORRELATION COEFFICIENT OF A 
POLYCHORIC TABLE. 


By A. RITCHIE-SCOTT, B.Sc. 


§ 1. INTRODUCTION.


We have at our disposal a considerable number of methods for finding the 
coefficient of correlation between two characters from a table of frequencies. These 
methods may be summarily named and classified as follows: 


1. Product Moment.
2. Tetrachoric r.
3. Marginal centroids.
4. Biserial r.
5. Three Row r.
6. Variate difference methods including the correlation of grades and ranks.
7. Equiprobable tetrachoric r.
8. Mean contingency.
9. Mean square contingency.


Each of these methods has its own specially appropriate field of usefulness, but 
there still remains one class of table for which no entirely satisfactory methods have 
been devised, namely those which contain more than 2 x 2 cells and fewer than 
4 x 4, to which the tetrachoric and mean square contingency methods respectively 
may be applied. 

It was with a view to investigating satisfactory methods for such tables that the 
following work was undertaken. Such tables arise under many circumstances, 
particularly when we can, as in many psychological investigations which depend 
upon the instinctive judgment of some character, definitely assign individuals with 
pronounced characters to either end of a scale, but are compelled to relegate
doubtful cases to an intermediate but somewhat indefinite category. We have, in 
a word, good, indifferent, bad; present, doubtful, absent—classifications resulting 
in a frequency table with three categories for one or both characters. 

In the present memoir a normal distribution has been assumed as it has been 
found to be not infrequently applicable and its assumption has given fairly satis- 
factory results even with distributions which are not strictly normal. 


§2. Notation. 


Let the normal surface (when standard deviations are used as units)
$$Nz(x, y) = \frac{N}{2\pi\sqrt{1-r^2}}\,e^{-\frac{x^2 + y^2 - 2rxy}{2(1-r^2)}}$$
be divided as in the diagram by planes drawn parallel to the $yz$ plane at the points $x = h_1$, $x = h_2$ ... and by planes parallel to the $xz$ plane at the points $y = k_1$, $y = k_2$ .... Let the planes intersect in lines whose projections are the points $(h_1, k_1)$, $(h_1, k_2)$, etc., contracted to 11, 12, etc., where the first figure is the suffix of $h$ and the second figure is the suffix of $k$.


[Diagram: the polychoric table. Columns are divided at $h_1, h_2, h_3, \dots$ and rows at $k_1, k_2, k_3, \dots$; the cells carry the frequencies $n_{11}, n_{21}, n_{31}, \dots$, with marginal totals $n_{1\cdot}, n_{2\cdot}, \dots$, $n_{\cdot1}, n_{\cdot2}, \dots$ and total $N$. Dotted lines mark one leading quadrant.]


The frequencies in the cells and the marginal totals are indicated by $n_{11}$, $n_{12}$,
etc., and $n_{1\cdot}$, $n_{2\cdot}$ ... $n_{\cdot1}$, $n_{\cdot2}$, etc., as shown in the diagram.


The surface may also be regarded as divided at each point 11, 12, ..., into four 
quadrants. One of these quadrants is shown by the dotted lines in the diagram. 
The quadrant in the position shown, viz. the left upper or (--) quadrant will be 
regarded in what follows as the leading quadrant and its frequency denoted by $m$.
Thus the quadrant shown will have the frequency m3. 
In the ordinary scheme for a tetrachoric table, the quadrants are denoted by 


a, b, c, d, and when necessary these letters will be used with the appropriate suffix. 
Thus the division at the point $s$, $t$ would be represented as in the diagram,


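[Diagram: the table divided at the point $(h_s, k_t)$ into the four quadrants $a_{st}$, $b_{st}$, $c_{st}$, $d_{st}$, with the marginal totals $m_{s\cdot}$, $m'_{s\cdot}$, $m_{\cdot t}$, $m'_{\cdot t}$ and the total $N$.]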


The marginal totals corresponding to the leading quadrant are denoted by $m_{s\cdot}$, $m_{\cdot t}$ and the complementary totals by $m'_{s\cdot}$, $m'_{\cdot t}$.


Clearly any cell frequency may be expressed in terms of quadrant frequencies since e.g.
$$n_{22} = m_{22} - m_{12} - m_{21} + m_{11}.$$
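The relation between cell frequencies and leading-quadrant frequencies is a two-way differencing of cumulative totals. The sketch below illustrates it on a made-up 3 × 3 table; the function names and the example counts are assumptions introduced here, not data from the memoir.

```python
def quadrant_from_cells(n):
    """Leading-quadrant totals: m[s][t] is the sum of n[i][j] over i <= s, j <= t."""
    rows, cols = len(n), len(n[0])
    m = [[0] * cols for _ in range(rows)]
    for s in range(rows):
        for t in range(cols):
            m[s][t] = (n[s][t]
                       + (m[s - 1][t] if s else 0)
                       + (m[s][t - 1] if t else 0)
                       - (m[s - 1][t - 1] if s and t else 0))
    return m

def cells_from_quadrant(m):
    """Invert the accumulation: n_st = m_st - m_(s-1,t) - m_(s,t-1) + m_(s-1,t-1)."""
    rows, cols = len(m), len(m[0])
    return [[m[s][t]
             - (m[s - 1][t] if s else 0)
             - (m[s][t - 1] if t else 0)
             + (m[s - 1][t - 1] if s and t else 0)
             for t in range(cols)]
            for s in range(rows)]

# Hypothetical cell frequencies, used only to exercise the two functions.
cells = [[12, 7, 3], [9, 20, 8], [2, 6, 15]]
quadrants = quadrant_from_cells(cells)
```

Differencing the quadrant totals returns the original cells, and the last quadrant total equals $N$.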


§ 3. ENNEACHORIC METHOD.


In order to determine $r$, since we assume the distribution to be normal and we
know the marginal totals, only one more datum is required. This for example
may be a frequency block (or the total frequency on a continuous system of cells).
As special cases we have the "briquette" or frequency on a rectangle of cells or
again the quadrant frequency. The block may be the frequency contents of a
group of corner, marginal or internal cells*. Consider (for future use) the general
case of a quadrant frequency.


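Let
$$\frac{m_{s\cdot}}{N} = {}_s\tau_0, \qquad \frac{m_{\cdot t}}{N} = {}_t\theta_0,$$
where ${}_s\tau$ and ${}_t\theta$ are the tetrachoric coefficients. Then
$$\frac{m_{st}}{N} = {}_s\tau_0\,{}_t\theta_0 + {}_s\tau_1\,{}_t\theta_1\,r + {}_s\tau_2\,{}_t\theta_2\,r^2 + \dots \qquad (1).$$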


In using Everitt's Tables of the tetrachoric functions in which $\tau_0$ and $\theta_0$ must
be less than $\tfrac12$ we must either rearrange the table or adjust the above formula for
the position of the mean with regard to the quadrants. It is more convenient to
adjust the formula as follows, dropping the suffixes as we are dealing with any point
of division.


lator sti wir + ete Wal +... 


* A "cell" is the least element for which the frequency is provided in the original data. Cells grouped
together for any working purposes are collectively termed a "block."
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Let Ween WS oh thy 


Mean in a, 
‘d 
az = T) + O) — 1 aP N 


=T  +0,—1L+7) 0) +716, 7 + 72O5'7? + ... 


= Tp tO Earn Oar Cae iceny) 


= T Op BiG (7, 0), 10 pecan ccm ce et cues dee eee ee eee (2). 
Mean in 6, 
m Cc 
Nee Oi 


= Og c= Oe (g20 7), “ates bree vosnes aes cnt eee (3). 
Mean in c¢, 
ie 9 b 
yao 
= Oy — (Ty Gy + 710, (— 7) + 72°03 (— 7)? + «.-) 
= —(l—7™)%— O (7, 6, — fr) 
= T9 05 — Or 0, — 71) ve cackon sae nk seen ee eee ee REE (4). 
Mean in d, W = T G5 + OL (005 7) ew haces pee eeReReeeee (5). 


In place of taking a quadrant we may take a marginal or internal block. I shall
only consider the latter as the case of a marginal block may be deduced from that
of an internal block by removing one of the bounding planes to an infinite distance.

In discussing the central block we in effect reduce any table to a 3 × 3 (enneachoric)
table as we consider it to be constituted of a central block (or group of cells)
and 4 marginal and 4 corner blocks. I may therefore use the nomenclature for a
3 × 3 table without any loss of generality.


ic aE a9 9, + © (a7, 29, 7) — 17 oy — (ur (17, 99, 7) 
= tig Op (ur (oiy40 17) ie aio Conte (ul (7, 39, 7) 


— (eT i 170) (605 Aah) +| (cars 14) (0, = AU) 7 
+ (s73 — aT) (0a — 392)'77 + ote. 


io ot Ly) 


N 


+ (971 — 473) (08, — 191) & + (aT2 — 172) (292 — 192) 7? + ete. 


As an example of the rearrangement of the formula for computation consider 


the case when the mean is in Mp: 


; Mo, ; 
(reel Ta) ia = 97 009 + at1 001 1 + ute o0o 7° + oTs os 7° + --:, 




: mm 
(mean imv0) => = 47009 + 171 201 T— 172 20o 12 + 173 203 7? — 
. Moy f , Q, 72 ' 6. 73 
(mean in ¢)  -so* = 99199 + Tr 191 7 — oT 192 7? + ats 103 7° — 


2 n 
(mean im @)) 2 = 475189 + a7 10, 1 ata 185 7? + 7s 103 7? +... 


Moo - 
WV aN (gg — My2 — Moy + My3) 


= ee "= 494) 7 + (a2! + 172) (382' + 192) 7? 
Mei Ta sUs 10a) Tr A CUC, Sastuninsceeucsanet esses (7). 
It will be noted that when one set of categories is symmetrical about the mean, 
i.e. when say 57,’ = ,7, all the terms of odd degree in r vanish. This corresponds 
to the fact that symmetrical categories may be reversed without altering the 
numerical value of the marginal totals and their relation to the central frequency ; 
but such reversal will change the sign of r. 


a (ary’ aay 171) (00; 


§ 4. STANDARD DEVIATION OF $r$ BY ENNEACHORIC METHOD.


We have now to determine the probable error of r found in this manner. 

Throughout what follows differentials will be used to indicate random sample 
variations, i.e. it is always supposed that the variations are small as compared to 
any quantity varied so that all the dn’s are small, or all the n’s are large quantities. 


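$$m_{st} = N\int_{-\infty}^{h_s}\!\int_{-\infty}^{k_t} z(x, y, r)\,dx\,dy = f(h_s, k_t, r) \qquad (8).$$


Since the variations of the means and the standard deviations are, in this form
of $m_{st}$, involved in the variations of $h_s$ and $k_t$, we have
$$dm_{st} = \frac{\partial f}{\partial h_s}\,dh_s + \frac{\partial f}{\partial k_t}\,dk_t + \frac{\partial f}{\partial r}\,dr \qquad (9).$$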
oh, 
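Evaluating the differential coefficients,
$$\frac{\partial f}{\partial h_s} = N\int_{-\infty}^{k_t} z(h_s, y, r)\,dy = N\,\frac{e^{-\frac12 h_s^2}}{\sqrt{2\pi}}\cdot\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\frac{k_t - rh_s}{\sqrt{1-r^2}}} e^{-\frac12 y^2}\,dy \qquad (10).$$


This is the area of that portion of the dichotomic plane $x = h_s$ which bounds
the quadrant $m_{st}$. But the area of the whole dichotomic plane is
$$N\int_{-\infty}^{+\infty} z(h_s, y, r)\,dy = N\,\frac{e^{-\frac12 h_s^2}}{\sqrt{2\pi}} = NH_s \qquad (11),$$
so that if we write
$$\frac{\partial f}{\partial h_s} = NH_sA_{st} \qquad (12),$$
where
$$A_{st} = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\frac{k_t - rh_s}{\sqrt{1-r^2}}} e^{-\frac12 y^2}\,dy \qquad (13),$$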


the factor $A_{st}$ will be that fraction of the whole dichotomic plane section, which
bounds the quadrant $m_{st}$ and will have no dimensions. The value of $A_{st}$ may be
taken directly from Sheppard's tables of the probability integral entering with
the argument $\frac{k_t - rh_s}{\sqrt{1-r^2}}$. It will be convenient to refer to this tabled integral as $\xi$,
so that
$$A_{st} = \xi\!\left(\frac{k_t - rh_s}{\sqrt{1-r^2}}\right) \qquad (14).$$

It is convenient to note that with this notation 


0 N S —1_2 7 
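Further since
$$m_{s\cdot} = \frac{N}{\sqrt{2\pi}}\int_{-\infty}^{h_s} e^{-\frac12 x^2}\,dx,$$
$$dm_{s\cdot} = NH_s\,dh_s, \qquad\text{and}\qquad dh_s = \frac{dm_{s\cdot}}{NH_s}.$$
Hence
$$\frac{\partial f}{\partial h_s}\,dh_s = NH_sA_{st}\,\frac{dm_{s\cdot}}{NH_s} = A_{st}\,dm_{s\cdot} \qquad (15).$$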
Similarly
$$\frac{\partial f}{\partial k_t}\,dk_t = B_{st}\,dm_{\cdot t} \qquad (16),$$
where
$$B_{st} = \xi\!\left(\frac{h_s - rk_t}{\sqrt{1-r^2}}\right) \qquad (17).$$
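The factors $A_{st}$ and $B_{st}$ are ordinary normal probability integrals, so they can be evaluated without tables. The following is a minimal sketch under that reading; `normal_cdf`, `A` and `B` are names chosen here for illustration, and the error function stands in for Sheppard's tables.

```python
from math import erf, sqrt

def normal_cdf(u):
    """Probability integral of the standard normal curve from -infinity to u."""
    return 0.5 * (1.0 + erf(u / sqrt(2.0)))

def A(h, k, r):
    """Fraction of the section x = h lying below y = k, as in equation (14)."""
    return normal_cdf((k - r * h) / sqrt(1.0 - r * r))

def B(h, k, r):
    """Fraction of the section y = k lying below x = h, as in equation (17)."""
    return normal_cdf((h - r * k) / sqrt(1.0 - r * r))
```

With $r = 0$ the factor `A(h, k, 0)` reduces to the marginal proportion below $k$, whatever the value of $h$, and `B` is `A` with the roles of $h$ and $k$ exchanged.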
Lastly
$$\frac{\partial f}{\partial r} = N\int_{-\infty}^{h_s}\!\int_{-\infty}^{k_t} \frac{\partial z}{\partial r}(x, y, r)\,dx\,dy = N\int_{-\infty}^{h_s}\!\int_{-\infty}^{k_t} \frac{\partial^2 z}{\partial x\,\partial y}(x, y, r)\,dx\,dy = Nz(h_s, k_t, r) = \chi_{st} \qquad (18),$$
which is the length of the ordinate at the point $(s, t)$. We may now write
$$dm_{st} = A_{st}\,dm_{s\cdot} + B_{st}\,dm_{\cdot t} + \chi_{st}\,dr \qquad (19),$$
and
$$-\chi_{st}\,dr = A_{st}\,dm_{s\cdot} + B_{st}\,dm_{\cdot t} - dm_{st} \qquad (20).$$
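The step leading to (18) rests on the identity $\partial z/\partial r = \partial^2 z/\partial x\,\partial y$ for the normal surface. A quick numerical check by central differences is sketched below; the step sizes and the test point are arbitrary choices made here, not values from the text.

```python
from math import exp, pi, sqrt

def z(x, y, r):
    """Ordinate of the standardised normal correlation surface."""
    q = (x * x + y * y - 2.0 * r * x * y) / (2.0 * (1.0 - r * r))
    return exp(-q) / (2.0 * pi * sqrt(1.0 - r * r))

def dz_dr(x, y, r, eps=1e-5):
    """Central-difference estimate of the derivative of z with respect to r."""
    return (z(x, y, r + eps) - z(x, y, r - eps)) / (2.0 * eps)

def d2z_dxdy(x, y, r, eps=1e-4):
    """Central-difference estimate of the mixed second derivative in x and y."""
    return (z(x + eps, y + eps, r) - z(x + eps, y - eps, r)
            - z(x - eps, y + eps, r) + z(x - eps, y - eps, r)) / (4.0 * eps * eps)

# Arbitrary test point; the two estimates should agree to rounding error.
difference = abs(dz_dr(0.4, -0.7, 0.3) - d2z_dxdy(0.4, -0.7, 0.3))
```

The agreement of the two estimates is what allows the double integral of $\partial z/\partial r$ to collapse to the single ordinate $\chi_{st}$.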


Considerable use will be made of this formula later and the following abbreviated
notation will be used:
$$A_{st}\,dm_{s\cdot} + B_{st}\,dm_{\cdot t} - dm_{st} = \delta P_{st} = -\chi_{st}\,dr \qquad (21),$$
and
$$A_{st}\,m_{s\cdot} + B_{st}\,m_{\cdot t} - m_{st} = P_{st} \qquad (22).$$


The reader must be careful to note that $\delta P_{st}$ is not $dP_{st}$ but only a part of it,
and this symbol is used here as at once a conventional abbreviation and a memoria
technica.


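Since $n_{22} = m_{22} - m_{12} - m_{21} + m_{11}$,
$$\therefore\; dn_{22} = dm_{22} - dm_{12} - dm_{21} + dm_{11}$$
$$= A_{22}dm_{2\cdot} + B_{22}dm_{\cdot2} + \chi_{22}dr - A_{12}dm_{1\cdot} - B_{12}dm_{\cdot2} - \chi_{12}dr - A_{21}dm_{2\cdot} - B_{21}dm_{\cdot1} - \chi_{21}dr + A_{11}dm_{1\cdot} + B_{11}dm_{\cdot1} + \chi_{11}dr \qquad (23),$$
$$\therefore\; -(\chi_{22} - \chi_{12} - \chi_{21} + \chi_{11})\,dr = (A_{22} - A_{21})\,dm_{2\cdot} - (A_{12} - A_{11})\,dm_{1\cdot} + (B_{22} - B_{12})\,dm_{\cdot2} - (B_{21} - B_{11})\,dm_{\cdot1} - dn_{22} \qquad (24).$$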
Reference to the diagram shows that $A_{22} - A_{21}$ and the other coefficients are
the proportions of the areas of the trapezettes bounding the briquette of volume
$n_{22}$. These areas may be systematically named for the whole table thus,
that is, the areas of the planes meeting in the line of which the point $s$, $t$ is the
projection, in the direction shown, are named from the point so that
$$A_{st} - A_{s,t-1} = \alpha_{st}, \qquad B_{st} - B_{s-1,t} = \beta_{st}.$$
Hence we may write
$$A_{22} - A_{21} = \alpha_{22}, \quad A_{12} - A_{11} = \alpha_{12}, \quad B_{22} - B_{12} = \beta_{22}, \quad B_{21} - B_{11} = \beta_{21}.$$


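If now we notice that since $m_{2\cdot} = N - n_{3\cdot}$ etc., and $m_{1\cdot} = n_{1\cdot}$,
$$dm_{2\cdot} = -dn_{3\cdot}, \quad dm_{\cdot2} = -dn_{\cdot3}, \quad dm_{1\cdot} = dn_{1\cdot}, \quad dm_{\cdot1} = dn_{\cdot1},$$
we then have
$$-(\chi_{11} - \chi_{12} - \chi_{21} + \chi_{22})\,dr = -(\alpha_{22}\,dn_{3\cdot} + \alpha_{12}\,dn_{1\cdot} + \beta_{22}\,dn_{\cdot3} + \beta_{21}\,dn_{\cdot1} + dn_{22}) \qquad (25).$$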


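Expanding this in terms of frequency volumes this becomes
$$(\chi_{11} - \chi_{12} - \chi_{21} + \chi_{22})\,dr = (\alpha_{12} + \beta_{21})\,dn_{11} + \beta_{21}\,dn_{21} + (\alpha_{22} + \beta_{21})\,dn_{31} + \alpha_{12}\,dn_{12} + dn_{22} + \alpha_{22}\,dn_{32} + (\alpha_{12} + \beta_{22})\,dn_{13} + \beta_{22}\,dn_{23} + (\alpha_{22} + \beta_{22})\,dn_{33} \qquad (26).$$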


It has already been shown by Pearson (Biometrika, vol. IX, p. 1) that when
random samples are taken from a population so large that its composition is not
appreciably affected by removing the samples we have the following relations:
$$\sigma^2_{n_{st}} = n_{st}\left(1 - \frac{n_{st}}{N}\right) \qquad (27),$$
$$\text{Mean}\,(dn_{st}\,dn_{s't'}) = -\frac{n_{st}\,n_{s't'}}{N} \qquad (28a),$$
$$\text{Mean}\,(dn_{s\cdot}\,dn_{s'\cdot}) = -\frac{n_{s\cdot}\,n_{s'\cdot}}{N} \qquad (28b),$$
$$\text{Mean}\,(dn_{s\cdot}\,dn_{\cdot t}) = n_{st} - \frac{n_{s\cdot}\,n_{\cdot t}}{N} \qquad (28c),$$
$$\text{Mean}\,(dn_{st}\,dn_{\cdot t'}) = -\frac{n_{st}\,n_{\cdot t'}}{N} \qquad (28d),$$
$$\text{Mean}\,(dn_{st}\,dn_{s'\cdot}) = -\frac{n_{st}\,n_{s'\cdot}}{N} \qquad (28e),$$
$$\text{Mean}\,(dn_{st}\,dn_{s\cdot}) = n_{st}\left(1 - \frac{n_{s\cdot}}{N}\right) \qquad (28f).$$
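Relations of the type (27) and (28a) are the variance and covariance of multinomial counts, with each expected frequency $n$ written as $N$ times its class probability. The sketch below checks them by exact enumeration for a small hypothetical sample; the sample size and the class probabilities are assumptions made only for the check.

```python
from itertools import product
from math import factorial

def multinomial_covariance(N, probs):
    """Exact covariance matrix of the class counts in samples of size N."""
    k = len(probs)
    mean = [N * p for p in probs]
    cov = [[0.0] * k for _ in range(k)]
    for counts in product(range(N + 1), repeat=k):
        if sum(counts) != N:
            continue
        # Multinomial probability of this particular set of counts.
        weight = float(factorial(N))
        for c, p in zip(counts, probs):
            weight *= p ** c / factorial(c)
        for i in range(k):
            for j in range(k):
                cov[i][j] += weight * (counts[i] - mean[i]) * (counts[j] - mean[j])
    return cov

N = 6
probs = [0.2, 0.3, 0.5]                 # hypothetical class probabilities
expected = [N * p for p in probs]       # the "n" of the text
cov = multinomial_covariance(N, probs)
```

The diagonal entries equal $n(1 - n/N)$ and the off-diagonal entries equal $-n\,n'/N$, in agreement with (27) and (28a).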


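Hence squaring both sides, summing for all possible samples and dividing
by the number of samples we have,
$$\begin{aligned}(\chi_{11} - \chi_{12} - \chi_{21} + \chi_{22})^2\sigma_r^2 ={}& (\alpha_{12} + \beta_{21})^2 n_{11} + \beta_{21}^2 n_{21} + (\alpha_{22} + \beta_{21})^2 n_{31} \\ &+ \alpha_{12}^2 n_{12} + n_{22} + \alpha_{22}^2 n_{32} + (\alpha_{12} + \beta_{22})^2 n_{13} + \beta_{22}^2 n_{23} + (\alpha_{22} + \beta_{22})^2 n_{33} \\ &- \frac1N\Big\{(\alpha_{12} + \beta_{21})\,n_{11} + \beta_{21}n_{21} + (\alpha_{22} + \beta_{21})\,n_{31} + \alpha_{12}n_{12} + n_{22} + \alpha_{22}n_{32} \\ &\qquad + (\alpha_{12} + \beta_{22})\,n_{13} + \beta_{22}n_{23} + (\alpha_{22} + \beta_{22})\,n_{33}\Big\}^2 \qquad (29).\end{aligned}$$
The expression within the large brackets
$$= \alpha_{12}n_{1\cdot} + \alpha_{22}n_{3\cdot} + \beta_{21}n_{\cdot1} + \beta_{22}n_{\cdot3} + n_{22}.$$
Calling this $Nm$ we have
$$\begin{aligned}(\chi_{11} - \chi_{12} - \chi_{21} + \chi_{22})^2\sigma_r^2 ={}& (\alpha_{12} + \beta_{21} - m)^2 n_{11} + (\beta_{21} - m)^2 n_{21} + (\alpha_{22} + \beta_{21} - m)^2 n_{31} \\ &+ (\alpha_{12} - m)^2 n_{12} + (1 - m)^2 n_{22} + (\alpha_{22} - m)^2 n_{32} \\ &+ (\alpha_{12} + \beta_{22} - m)^2 n_{13} + (\beta_{22} - m)^2 n_{23} + (\alpha_{22} + \beta_{22} - m)^2 n_{33} \qquad (30).\end{aligned}$$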
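The passage from (29) to (30) is the usual identity for a weighted sum of squares about its weighted mean. A short numerical illustration follows; the coefficients and frequencies are arbitrary numbers chosen here and carry no meaning beyond the check.

```python
def uncentred_form(coeffs, freqs):
    """Sum of c^2 n minus (sum of c n)^2 / N, the shape of equation (29)."""
    N = sum(freqs)
    total = sum(c * n for c, n in zip(coeffs, freqs))
    return sum(c * c * n for c, n in zip(coeffs, freqs)) - total * total / N

def centred_form(coeffs, freqs):
    """Sum of (c - m)^2 n with N m = sum of c n, the shape of equation (30)."""
    N = sum(freqs)
    m = sum(c * n for c, n in zip(coeffs, freqs)) / N
    return sum((c - m) ** 2 * n for c, n in zip(coeffs, freqs))

# Nine arbitrary coefficients and cell frequencies for a 3 x 3 table.
coeffs = [0.55, 0.30, 0.70, 0.25, 1.00, 0.40, 0.60, 0.35, 0.75]
freqs = [12, 7, 3, 9, 20, 8, 2, 6, 15]
```

Both functions return the same value, which is why (30) can replace (29) without further conditions on the coefficients.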
The following form for m is instructive although giving an apparently less 
symmetrical form than the above, 


1 
m= 75 (Gist. goa Oa a ei ee Wes) 


1 
= NT | (Ay — Ayy) my. + (Boz — By) (N — m.,) + (By — By) my 


+ (Bog — By) (N — mug) + M22 — Mg, — My + ma} 
= e | Aven. + Byym.g — My — AqymM,. — Byym., + M4, — AggMy. 
— Boy M.y + Mg + Ap, Mg. + Byym.1 — Mg, + N (Ag, — Agi + Boo — By} 


=a, Re 11 12 21 22 


We may then write 
(X11 — Xie — Xa + Xv)? Gr 
3B? BY? 
= (a1. — O99 + Bor — Bos + 4 M1 + (0 = Ge5 + Bo: — Boo + ¥) Noy 


2 2 
ae (ie Bo + » M31 1 (a1. =033 10 Bae 1 ee) N12 
1 1 2 2 
a (5 — Gee 5 — Bos + 3) Nee + (0 Sf) N30 


2 


2 2 
al (a1 — gg + -) M43 1 (0 — ge + =) Neg + (*) Lng) doce: (32). 
It will be shown in a later paper that the coefficients of the cell frequencies are
functions of moments of the frequencies about the mean.


If any relation between $h_1$, $h_2$, $k_1$, $k_2$ makes
$$\chi_{11} - \chi_{12} - \chi_{21} + \chi_{22} = 0,$$
while the right side of the $\sigma_r^2$ equation remains finite the value of the expression for
$\sigma_r^2$ will become infinite. $\chi_{11} - \chi_{12} - \chi_{21} + \chi_{22}$ obviously vanishes (1) when $h_1 = h_2$ or
$k_1 = k_2$, i.e. when either of the central categories becomes vanishingly small, and
(2) when $h_1 = -h_2$, $k_1 = -k_2$ and $r = 0$, i.e. when both sets of the extreme
categories have equal frequencies and the correlation is very small.


(1) When $h_1 = h_2$. Then $\chi_{11} = \chi_{21}$, $\chi_{12} = \chi_{22}$, $\alpha_{12} = \alpha_{22} = \alpha$ say, $\beta_{21} = \beta_{22} = 0$,
$n_{21} = n_{22} = n_{23} = 0$, $n_{2\cdot} = 0$. Then
$$m = \frac{\alpha\,(n_{1\cdot} + n_{3\cdot})}{N} = \alpha,$$
and the right side of the $\sigma_r^2$ equation reduces to zero giving the indeterminate form
$\frac00$ for $\sigma_r^2$.


(2) When $h_1 = -h_2$, $k_1 = -k_2$ and $r = 0$. This case will be discussed in the
next section.

§ 5. STANDARD DEVIATION OF ENNEACHORIC $r$ IN SPECIAL CASES.


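Two particular cases are of interest, (1) when $r = 0$, (2) when the table degenerates
into a 2 × 2 table.

(1) When $r = 0$
$$A_{st} = \xi(k_t) = \frac{m_{\cdot t}}{N}, \qquad B_{st} = \xi(h_s) = \frac{m_{s\cdot}}{N},$$
$$P_{st} = A_{st}m_{s\cdot} + B_{st}m_{\cdot t} - m_{st} = \frac{m_{\cdot t}\,m_{s\cdot}}{N} + \frac{m_{s\cdot}\,m_{\cdot t}}{N} - \frac{m_{s\cdot}\,m_{\cdot t}}{N} = \frac{m_{s\cdot}\,m_{\cdot t}}{N}.$$
Hence
$$\alpha_{12} = A_{12} - A_{11} = \frac{m_{\cdot2} - m_{\cdot1}}{N} = \frac{n_{\cdot2}}{N}, \qquad \alpha_{22} = A_{22} - A_{21} = \frac{m_{\cdot2} - m_{\cdot1}}{N} = \frac{n_{\cdot2}}{N}.$$
Similarly
$$\beta_{22} = \beta_{21} = \frac{n_{2\cdot}}{N}.$$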
M = Ago + Boy ssi ease 


eae! 
= ae W = iN (M41 — Myy — My, + Ms) 
$$\chi_{11} - \chi_{12} - \chi_{21} + \chi_{22} = N\,(H_1K_1 - H_1K_2 - H_2K_1 + H_2K_2) = N\,(H_1 - H_2)(K_1 - K_2),$$

and the right-hand side of the equation reduces to 


WENe Pe a Bees 

(7) (M41 + M43 + gq + M33) + (ma 
Nog — Ng.\? Hy) Wns  Wae\* 

Eis (i ) (yp + M3) + (1 =f — NT | | Nop: 


= Nog this resolves into 


(Ree), (N — ng.) (N — 1.9) 


; No Ne; 
Remembering that 2” 


(N= Mee\? (M2)? Ne: 
N N (Sr) Gye) Oe 
ONS oe as PY ao war)! ee nal) Ns.N ey 
Fe) (yt) — meat (Sy) NW 
N — ny.) (N — 0.5) Ng. Neg 
ei aates) _ ea) isa {Ng.N.g + (N — Ng.) Neg + (N — Ney) Ny 
+ (N — ng.) (N — n.2)} 
Ny. (N — ng.) Nog (N — n. 
Se eee (33). 
Hence 
aK 2 ee re hase N—n,.) n..(N—n. tl taneeGrens 
INCE Ci ae Phe ( W 2) .— N 2) a ee yy a 
= re Dear nopy 34 
and om “JN Nh A) (Kak os (34). 
This may also be expressed as follows: 
2 9 _ Mae (N — Mg.) Np (N — 2-9) 
Ong? Ong = N N 
NzMeg Nye + Ng. Ney + Neg 
Nie ON) N 
= Noo (My Nag Nay te Mgg) cee even a eeelectentene (35) 
i 
A Ve Ngo (My + Ng + Nyy + Ngs) 
an oo 2 Se gs ce nbs ona Cae 36). 
NH, — Hy) (hi — Ba) 


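When the extreme categories are nearly equal and $r$ is zero $H_1 = H_2$ and $K_1 = K_2$
and the value for $\sigma_r^2$ becomes infinite. It is necessary then to keep second powers
of differences to determine $\sigma_r^2$. Keeping second order terms we have:
$$dm = \frac{\partial f}{\partial h}\,dh + \frac{\partial f}{\partial k}\,dk + \frac{\partial f}{\partial r}\,dr + \frac12\left\{\frac{\partial^2 f}{\partial h^2}(dh)^2 + \frac{\partial^2 f}{\partial k^2}(dk)^2 + \frac{\partial^2 f}{\partial r^2}(dr)^2 + 2\frac{\partial^2 f}{\partial h\,\partial k}(dh\,dk) + 2\frac{\partial^2 f}{\partial h\,\partial r}(dh\,dr) + 2\frac{\partial^2 f}{\partial k\,\partial r}(dk\,dr)\right\} \qquad (37).$$


To find $\frac{\partial^2 f}{\partial h^2}$, etc. we may proceed as follows.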
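By straightforward differentiation we find that
$$\frac{\partial^2 f}{\partial h^2} = \frac{\partial}{\partial h}\left(\frac{\partial f}{\partial h}\right) = \frac{\partial}{\partial h}\left\{\frac{N}{2\pi\sqrt{1-r^2}}\int_{-\infty}^{k} e^{-\frac{h^2 + y^2 - 2rhy}{2(1-r^2)}}\,dy\right\}$$
$$= -\frac{hN}{2\pi\sqrt{1-r^2}}\int_{-\infty}^{k} e^{-\frac{h^2 + y^2 - 2rhy}{2(1-r^2)}}\,dy - \frac{rN}{2\pi\sqrt{1-r^2}}\,e^{-\frac{h^2 + k^2 - 2rhk}{2(1-r^2)}}$$
$$= -N\left(hHA + \frac{r\chi}{N}\right) \qquad (38),$$
$$\frac{\partial^2 f}{\partial k^2} = -N\left(kKB + \frac{r\chi}{N}\right) \qquad (39),$$
$$\frac{\partial^2 f}{\partial r^2} = \frac{\partial\chi}{\partial r} = \left\{\frac{r}{1-r^2} + \frac{(h - kr)(k - rh)}{(1-r^2)^2}\right\}\chi \qquad (40),$$
$$\frac{\partial^2 f}{\partial h\,\partial r} = \frac{\partial\chi}{\partial h} = -\frac{h - rk}{1-r^2}\,\chi \qquad (41),$$
$$\frac{\partial^2 f}{\partial k\,\partial r} = \frac{\partial\chi}{\partial k} = -\frac{k - rh}{1-r^2}\,\chi \qquad (42),$$
$$\frac{\partial^2 f}{\partial h\,\partial k} = \chi \qquad (43).$$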


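Hence summing for all samples and dividing by the number of samples and
denoting this operation by $\mathfrak{S}$, we have, since quantities of the first order disappear,
an equation involving $\mathfrak{S}(dh)^2$, $\mathfrak{S}(dh\,dk)$, etc. We may determine these
values thus:
$$dh_s = \frac{dm_{s\cdot}}{NH_s},$$
$$\therefore\; \mathfrak{S}(dh_s)^2 = \mathfrak{S}\left(\frac{dm_{s\cdot}}{NH_s}\right)^2 = \frac{m_{s\cdot}\left(1 - \frac{m_{s\cdot}}{N}\right)}{N^2H_s^2} \qquad (44).$$
Similarly
$$\mathfrak{S}(dk_t)^2 = \frac{m_{\cdot t}\left(1 - \frac{m_{\cdot t}}{N}\right)}{N^2K_t^2} \qquad (45),$$
$$\mathfrak{S}(dh_s\,dk_t) = \frac{m_{st} - \frac{m_{s\cdot}\,m_{\cdot t}}{N}}{N^2H_sK_t} \qquad (46).$$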
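To find $\mathfrak{S}(dh\,dr)$ we have
$$-\chi\,dr = A_{st}\,dm_{s\cdot} + B_{st}\,dm_{\cdot t} - dm_{st}, \qquad dh_s = \frac{dm_{s\cdot}}{NH_s},$$
$$\therefore\; -\chi NH_s\,\mathfrak{S}(dr\,dh_s) = A_{st}\,\mathfrak{S}(dm_{s\cdot})^2 + B_{st}\,\mathfrak{S}(dm_{s\cdot}\,dm_{\cdot t}) - \mathfrak{S}(dm_{st}\,dm_{s\cdot})$$
$$= A_{st}\,m_{s\cdot}\left(1 - \frac{m_{s\cdot}}{N}\right) + B_{st}\left(m_{st} - \frac{m_{s\cdot}\,m_{\cdot t}}{N}\right) - m_{st}\left(1 - \frac{m_{s\cdot}}{N}\right) \qquad (47).$$


After simplification this becomes
$$\left(1 - \frac{m_{s\cdot}}{N}\right)P_{st} - B_{st}\,(m_{\cdot t} - m_{st}),$$
$$\therefore\; \mathfrak{S}(dr\,dh) = -\frac{1}{\chi NH_s}\left\{\left(1 - \frac{m_{s\cdot}}{N}\right)P_{st} - B_{st}\,(m_{\cdot t} - m_{st})\right\} \qquad (48).$$
Similarly
$$\mathfrak{S}(dr\,dk) = -\frac{1}{\chi NK_t}\left\{\left(1 - \frac{m_{\cdot t}}{N}\right)P_{st} - A_{st}\,(m_{s\cdot} - m_{st})\right\} \qquad (49).$$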


Equation (37) then becomes 


0=N (- hi *) % (dh)? + N (- kKB — ¥4 % (dh)? 


x { , (h— kr) (k— rh) jes 
‘reals ear LS @yr25 —5 xX + & (dhdr) 
ge 4 & (dkdr) + 2yS5\(dhdh)......1 ss ee (50). 


.. Substituting the values of the $’s, and transposing o,? we find 


mM,. 
= m,. (1 —-= 

x {a Gee my a ( x) : ( 7) 

peers | ap cee o2=—-N(hHA+?r v) eR 


Met (1 = 7) 


} y Meg. 
+2 aE Pe ‘NHy (1 7 iT) ler mn Bs, (m.; a) mao} 


ok — rh 1 i mM. 
25 may” Wit (i _ ae Ps — Ag: (ms. — sdf 
M,.Mag 
= 
Ey) meme Gi, sie,(o:0'n vsa‘a pie, ote aveielececeye ebetelorere aieie’e eia¥afetats ehele oe EOE O1). 
+ OY ER ree (51) 


When r = 0 and hy = — hy, ky = — kg, x reduces to NHK, etc., as above and the 
equation reduces to 


NE Khe (1 - ") tas (1 aa ie. (52), 


aN («KB +r x) 


Met — 


NH N NK 


If we take this equation for each of the four points 11, 12, 21, 22 and combine 
them according to the scheme mg. = 14; — Mg — Mg, + Moo as before, the left 
member of the equation becomes 


— NHK (hyk, — hykty — hgh, + highs) 0,2 = — NHK (hy ky + hy ky + hyky + lyk) 07? 


= —4ANE Kh kyos eee (53). 
In a similar manner the right side reduces to 
h my. k m. 
WH, (ms ot N Nan) + WK, (1 | W 7) nda fae eee (54), 
which may be further simplified as follows 
m My. 
Ma + ae Noo = No 4 WT (m.. — 249) 


_ 9 Mae Mey (1 ee 8 ) 
rs N 
ep + "2 oa WR a een (55) 
Similarly Noa + i Nog = 2 a re it ets Saeceeemionce (56) 
Hence substituting we have 
No v C. 
SAN Kaka oe — 2 We ie my. + K, mas} Deeb oe ce fi (57), 
dn a Eh 
and G2 = INHKD K, = M,.+ EK, ma| ene eect ee (58) 


($h_1$ and $k_1$ are of course negative numbers) which remains finite when the extreme
categories are equal and $r = 0$.
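(2) When we put $h_2 = k_2 = \infty$ the enneachoric table degenerates into a tetrachoric
table and we have
$$n_{13} = n_{23} = n_{33} = 0, \qquad n_{31} = n_{32} = n_{33} = 0,$$
$$\alpha_{22} = \beta_{22} = 0, \qquad \chi_{12} = \chi_{21} = \chi_{22} = 0,$$
and
$$m = \frac{\alpha_{12}n_{1\cdot} + \beta_{21}n_{\cdot1} + n_{22}}{N}.$$
But
$$\alpha_{12} = A_{12} - A_{11} = \xi(\infty) - A_{11} = 1 - A_{11}.$$
Similarly $\beta_{21} = 1 - B_{11}$.
Hence
$$m = \frac{(1 - A_{11})\,m_{1\cdot} + (1 - B_{11})\,m_{\cdot1} + N - m_{1\cdot} - m_{\cdot1} + m_{11}}{N} = \frac{N - (A_{11}m_{1\cdot} + B_{11}m_{\cdot1} - m_{11})}{N} = 1 - \frac{P_{11}}{N},$$
$$\therefore\; \chi_{11}^2\sigma_r^2 = \left(\frac{P_{11}}{N} - A_{11} - B_{11} + 1\right)^2 n_{11} + \left(\frac{P_{11}}{N} - B_{11}\right)^2 n_{21} + \left(\frac{P_{11}}{N} - A_{11}\right)^2 n_{12} + \left(\frac{P_{11}}{N}\right)^2 n_{22} \qquad (59).$$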


This form of the standard deviation of a tetrachoric correlation will be referred 
to again later, and will then be reduced to a still more symmetrical form. 


From the above it will be seen that we can determine the correlation coefficient
from any frequency block, assuming a normal distribution, but the accuracy of
this determination varies with the position and size of the frequency block. The
probable error ($.67449\,\sigma_r$) varies from cell to cell, and an unlucky choice of the working
cell may lead to a correlation coefficient with large probable error. A correction
of this latitude might in some cases be obtained by using another cell. But as the
$r$ from this cell would probably differ from that previously found, and as neither of
them would be identical with that of the normal surface from which they are supposed
to be sampled, we must find some means of approximating to the "best" $r$.
The most general method of doing this, following on the above, would seem to be to
weight each of the frequency blocks and determine the weights so that the resulting
probable error of the weighted $r$ is a minimum. In doing this we must
have regard to the fact that the variates we are dealing with are not independent
but correlated. We must consider this method.


Let the polychoric table have $p$ rows and $q$ columns so that the indices of the
last row and column are $1q, 2q, \dots pq$ and $p1, p2, \dots pq$ respectively, and let each of
the frequency volumes be weighted by an arbitrary weight $w_{11}, w_{12}, \dots$ indicated by
the same suffix as their respective cells.


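Then
$$w_{11}n_{11} + w_{12}n_{12} + \dots = w_{11}m_{11} + w_{12}(m_{12} - m_{11}) + w_{21}(m_{21} - m_{11}) + w_{22}(m_{22} - m_{12} - m_{21} + m_{11}) + \dots + w_{st}(m_{st} - m_{s-1,t} - m_{s,t-1} + m_{s-1,t-1}) + \dots$$
$$= (w_{11} - w_{12} - w_{21} + w_{22})\,m_{11} + \dots + (w_{st} - w_{s+1,t} - w_{s,t+1} + w_{s+1,t+1})\,m_{st} + \dots$$
$$= \omega_{11}m_{11} + \omega_{12}m_{12} + \dots \omega_{st}m_{st} + \dots \qquad (60).$$
Then
$$\frac1N(w_{11}n_{11} + w_{12}n_{12} + \dots) = \frac1N(\omega_{11}m_{11} + \omega_{12}m_{12} + \dots) = \omega_{11}({}_1\tau_0\,{}_1\theta_0 + r\Theta_{11}) + \omega_{12}({}_1\tau_0\,{}_2\theta_0 + r\Theta_{12}) + \dots \qquad (61),$$
which is an equation to find $r$, using $\Theta_{st}$ as an abbreviation for
$${}_s\tau_1\,{}_t\theta_1 + {}_s\tau_2\,{}_t\theta_2\,r + {}_s\tau_3\,{}_t\theta_3\,r^2 + \text{etc.}$$
(Compare the usage in equations (2) to (5).)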


$m_{1q}, m_{2q}, \dots, m_{p1}, m_{p2}, \dots$ are complete segments of the normal solid and are independent
of $r$, i.e. $m_{sq}/N = {}_s\tau_0\,{}_q\theta_0$ and $m_{pt}/N = {}_p\tau_0\,{}_t\theta_0$, and these terms disappear from both sides of the
equation. The $w$'s of the cells in the last row and last column of the table may therefore
be dropped and we have $(p - 1)(q - 1)$ $w$'s to determine. The probable error may
be written down in the manner already shown and as there are $(p - 1)(q - 1)$
independent frequencies there is sufficient data to determine the $w$'s so that the
probable error is a minimum.


There is no essential difficulty in carrying this out except that the coefficients
rapidly become very cumbersome as tables increase beyond 3 × 3 for then the
simplification $dm_{2\cdot} = -dn_{3\cdot}$ is no longer available. It will be found however that
the same result may be derived more simply from the method discussed below.


§ 6. POLYCHORIC METHOD.


The frequency surface divided into $p$ columns and $q$ rows is divided at each
point 11, 12 ... into four quadrants, and for each of these divisions a value for $r$,
viz. $r_{11}, r_{12}, \dots$, may be found by the tetrachoric method. These may be regarded
as approximations to the true value of $r$, and their weighted mean found, the weights
being determined so that the probable error of the mean $r$ so found shall be a
minimum.


Let
$$r = \frac{C_{11}r_{11} + C_{12}r_{12} + \dots}{C_{11} + C_{12} + \dots} \qquad (62).$$
Then
$$(C_{11} + C_{12} + \dots)\,dr = C_{11}\,dr_{11} + C_{12}\,dr_{12} + \dots \qquad (63).$$
Squaring, summing for all possible values and dividing by the number of
samples,
$$(\Sigma C_{st})^2\sigma_r^2 = \Sigma\,(C_{st}\sigma_{st})^2 + 2\Sigma\,(C_{st}C_{s't'}\sigma_{st}\sigma_{s't'}R_{st,s't'}) \qquad (64),$$
where $\sigma_{st} = \sigma_{r_{st}}$, $R_{st,s't'} = r_{r_{st}r_{s't'}}$.
Let S== 2 (Cyto)? + 2% (Cet Cee Oscoer Ree, st’) 
CnC): 
Then for a minimum 
os os 
LS = ~~ dC iG. = 
g aC, dC, + aC» Che 0, 
OKO dCi, + dCi, pire Ob 
aN os 
. (= =(-z~—-—A)dC,,=...=0. 
(; = d) dC, (; = ) dC, 
Coy? + Cy2071072 Ruy, 12 + Cy3011013 Ry 15 + oe = r, 
ORomGre Matas + On ster er + Cor Onettra a5 + <a A eee ccc ces (65). 
aa i : = on ae i Sere 
O71 O12 O13 
1 
On 1 Rup Rit 
| | ee Le RE eee (66) 
| wo R mie it R 9 
| O19 11,1 12,13 
1 
| O13 Ry 13 Ry, 13 I 
AA 
Then CuO a ae raat 
‘0006 
NN ae 
, _ __ “Xoorz 
CisO15 = Ne Bible Kole rereleYsl.eienevelereiaretslertia 8isisievs ci eeieieieie (67) 
Since A is arbitrary we may put & (C,,) = 1, 
AeA A Nine 
SON) 0000 oo, oor | _ [ 
(Cs) | 1 is on 2 Acco j 
A (A — Aoooo) 
eaten — MED URCREE Ino ee oe 68 
Aoooo ( ), 
A 
FN pepe te IS nahes Te ake ese ntaheees 69 
A= Bow me 
Cif = Aoou 
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bis? (Cs10s¢)? =f 22 (Cs Cee ts ogy Ree, st’) 
il A 
SS |Cate (ee a Aco Prise IE Noo P42 «st 


SS Aoooo A 
a6 Avast ae Mecietae nce} 


Ost 


S 


[Bs Avooo \ Ost ) 
= _ 2G Bowo eA ee (71) 
A— Aoooo Aoono =i 
; A 
and since ZC,,; = pO sacivsetag inca tes oe nhs ten (72). 
st o Neace ae A " ) 
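The minimisation described in this section is the familiar problem of combining correlated estimates with least variance: the weights are proportional to the solution of the linear system formed from the covariances. A small numerical sketch is given below. It uses plain Gaussian elimination, and the standard deviations and correlations in the example are hypothetical values, not figures from the memoir.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for i in range(col + 1, n):
            factor = m[i][col] / m[col][col]
            for j in range(col, n + 1):
                m[i][j] -= factor * m[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def min_variance_weights(sigmas, R):
    """Weights C (summing to 1) that minimise the variance of the weighted mean,
    given the standard deviations and the correlation matrix R of the estimates."""
    n = len(sigmas)
    cov = [[sigmas[i] * sigmas[j] * R[i][j] for j in range(n)] for i in range(n)]
    raw = solve(cov, [1.0] * n)        # proportional to the optimal weights
    total = sum(raw)
    return [c / total for c in raw], 1.0 / total

# Two uncorrelated estimates with standard deviations 1 and 2 (hypothetical).
weights, variance = min_variance_weights([1.0, 2.0], [[1.0, 0.0], [0.0, 1.0]])
```

For uncorrelated estimates the weights reduce to the reciprocals of the variances; with correlation present some weights can shrink to zero or even become negative.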


§ 7. COMPARISON OF POLYCHORIC AND ENNEACHORIC COEFFICIENTS OF
CORRELATION.


We may now compare the polychoric coefficient ($r_p$) with the enneachoric
coefficient ($r_e$) previously found.


Since >5, (COs) Se Ike 
lp = CAntan + Calas + ee wer meee ree nv ccc ccserreccene (73), 
s dr, == COT + Ore Olas + doh —— Ts Ch oe a Cis ee eee eeeres (74), 
N11 X12 


— (Cyydryy + Cyydrys + «..) 


== (A,,dm,. + By, dm., — dm) + oH (A,.dm,. + By.dm.. — dims) + ete. 
11 


Transposing and rearranging we find 
CO 
11 


Xu 


dm, + Cis diy, + 
X12 


Crs 
270 (A,,dm,. + Bydm..+ x47) + oe al Byodm.. + X424742) + ete. 
12 


11 


A 
Cu (e dh, + Chi dk, + in Ap a) +5 Cie oe dh, + Chie dity + fe dyy) + 
X12 


2 5eR oh, Ok, Ory; oh Ok, Oi. 

fase RE (76), 
h, ky ; 
where fa=N | z (uv, y, 7) dxdy = mgt; 
6! (G} C Gi 
u My 41 2 My +. = a (179 199 + Gro) + —? (179 299 + Dye) + 

Xu X12 Xu X12 (77) 

eda ii); 


using $\Theta_{st}$ as in equation (61), page 106, which is an equation of which $r_p$ is a root.
If in this equation we put $C_{11} = \chi_{11}$, $C_{12} = -\chi_{12}$, $C_{21} = -\chi_{21}$, $C_{22} = \chi_{22}$, and the
remaining $C$'s $= 0$ we have the equation for the enneachoric coefficient and $r_e$
now appears to be the weighted mean of four tetrachoric $r$'s,
$$r_e = \frac{\chi_{11}r_{11} - \chi_{12}r_{12} - \chi_{21}r_{21} + \chi_{22}r_{22}}{\chi_{11} - \chi_{12} - \chi_{21} + \chi_{22}}.$$
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Comparing this with the generalized equation (61) for enneachoric 7 previously 
found in which all the frequency volumes were weighted, we have 


$$\omega_{11} = \frac{C_{11}}{\chi_{11}}, \qquad \omega_{12} = \frac{C_{12}}{\chi_{12}}, \qquad \ldots, \qquad \omega_{st} = \frac{C_{st}}{\chi_{st}},$$

and since
$$\omega_{st} = w_{st} - w_{s+1,t} - w_{s,t+1} + w_{s+1,t+1},$$

it can be easily shown* if there are $p$ rows and $q$ columns that

$$w_{st} = \omega_{st} + \omega_{s,t+1} + \ldots + \omega_{s,q} + \omega_{s+1,t} + \omega_{s+1,t+1} + \ldots + \omega_{s+1,q} + \ldots + \omega_{p,t} + \omega_{p,t+1} + \ldots + \omega_{pq},$$

that is the sum of all the weights having the same suffixes as the points contained
within the $d$ quadrant of which $n_{st}$ occupies the corner.
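The passage from weights on the cumulative frequencies ($m$) to weights on the cell frequencies ($n$) can be checked numerically. The following is a minimal sketch in Python, not part of the memoir: the function names and the small grid of weights are hypothetical, and the cell frequencies are borrowed from Table A purely as illustrative numbers.

```python
def cumulative_frequencies(n):
    """m[s][t] = sum of n[i][j] over i <= s, j <= t (upper-left block)."""
    rows, cols = len(n), len(n[0])
    m = [[0] * cols for _ in range(rows)]
    for s in range(rows):
        for t in range(cols):
            m[s][t] = (n[s][t]
                       + (m[s - 1][t] if s > 0 else 0)
                       + (m[s][t - 1] if t > 0 else 0)
                       - (m[s - 1][t - 1] if s > 0 and t > 0 else 0))
    return m


def cell_weights(omega):
    """w[s][t] = sum of omega[i][j] over i >= s, j >= t (lower-right block)."""
    rows, cols = len(omega), len(omega[0])
    w = [[0] * cols for _ in range(rows)]
    for s in reversed(range(rows)):
        for t in reversed(range(cols)):
            w[s][t] = (omega[s][t]
                       + (w[s + 1][t] if s + 1 < rows else 0)
                       + (w[s][t + 1] if t + 1 < cols else 0)
                       - (w[s + 1][t + 1] if s + 1 < rows and t + 1 < cols else 0))
    return w


def weighted_sum(weights, freqs):
    """Sum of weight * frequency over all cells of two equally shaped grids."""
    return sum(x * y for wr, fr in zip(weights, freqs) for x, y in zip(wr, fr))


n = [[193, 122, 20], [134, 209, 78], [31, 113, 100]]  # illustrative cell frequencies
omega = [[1, 2, 0], [3, 4, 0], [0, 0, 0]]             # hypothetical weights on the m's
m = cumulative_frequencies(n)
w = cell_weights(omega)
# The same linear function, written on the m's or on the n's:
same_total = weighted_sum(omega, m) == weighted_sum(w, n)
```

Under these assumptions the two weighted sums agree, which is the rearrangement the footnote carries out by hand.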


* Consider a two-fold extension ruled and named in a manner similar to the polychoric scheme on 
page 94. 


Then $m_{11} = n_{11}$,

$$m_{st} = n_{11} + n_{12} + \ldots + n_{1t} + n_{21} + n_{22} + \ldots + n_{2t} + \ldots + n_{s1} + n_{s2} + \ldots + n_{st}.$$

Hence
$$\omega_{11} m_{11} + \omega_{12} m_{12} + \ldots = \omega_{11} n_{11} + \omega_{12}(n_{11} + n_{12}) + \omega_{13}(n_{11} + n_{12} + n_{13}) + \ldots + \omega_{st}(n_{11} + n_{12} + \ldots + n_{st}).$$

If we rearrange this in terms of $n_{11}$, $n_{12}$ we shall have
$$n_{11}(\omega_{11} + \omega_{12} + \ldots + \omega_{st} + \ldots) + n_{12}(\omega_{12} + \omega_{13} + \ldots + \omega_{st} + \ldots) + \ldots = w_{11} n_{11} + w_{12} n_{12} + \ldots + w_{st} n_{st} + \ldots + w_{pq} n_{pq}.$$

It is clear that $w_{st}$ will be the sum of the $\omega$'s belonging to all the $m$'s of which $n_{st}$ is a constituent,
that is, from the figure, all the $m$'s whose boundary lines lie beyond the lines $h = s - 1$, $k = t - 1$, i.e.

$$w_{st} = \sum_{st}^{pq} \omega.$$

The relation $n_{st} = m_{st} - m_{s-1,t} - m_{s,t-1} + m_{s-1,t-1}$
may be compared to the partial finite difference in two variables
$$\Delta_x \Delta_y u_{x,y} = u_{x+1,y+1} - u_{x+1,y} - u_{x,y+1} + u_{x,y},$$
which may help to make the above relation clearer.

§ 8. COMPUTATION OF $R$'S AND $\sigma$'S.


As the computation of $r_p$ even for an enneachoric table is somewhat lengthy, it
is necessary to have a definite scheme to work to. In addition to this the values of
the $R$'s when resolved into their constituents present some interesting features.


A new expression for tetrachoric 7 has already been deduced from the degenera- 
tion of an enneachoric table. The following is a derivation of it directly and in a 
more symmetrical form. 


Consider the tetrachoric table 


                     D
            a        |        b        |  n_{.1}
    F ---------------E--------------- L
            c        |        d        |  n_{.2}
                     G
          n_{1.}            n_{2.}     |  N

Let $A$ and $B$ have the same significance as before, i.e. $A$ is the fraction which
the area of the plane $DE$ is of the whole dichotomic plane and $B$ the same for $FE$,
and write
$${}_aP = A n_{1\cdot} + B n_{\cdot 1} - a \quad (79),$$
where the $a$ suffix is used to indicate that it is the $P$ of the leading or $a$ quadrant.
Then since the fractional area of $EG$ will be $1 - A$ and of $EL$, $1 - B$, the
corresponding $P$ for the $b$ quadrant will be
$${}_bP = A n_{2\cdot} + (1 - B) n_{\cdot 1} - b$$
$$= A(N - n_{1\cdot}) + (1 - B) n_{\cdot 1} - (n_{\cdot 1} - a)$$
$$= AN - (A n_{1\cdot} + B n_{\cdot 1} - a)$$
$$= AN - {}_aP \quad (80).$$

Similarly
$${}_cP = BN - {}_aP \quad (81),$$
$${}_dP = -N(A + B - 1) + {}_aP \quad (82).$$

Hence we have
$${}_aP + {}_bP + {}_cP + {}_dP = N \quad (82)\ \text{bis}.$$
We have already seen (20) that
$$-\chi\,\delta r = A\,dm_{1\cdot} + B\,dm_{\cdot 1} - dm_{11} = \delta\,{}_aP \quad (83).$$

Using the symbol $\Sigma$ as before to denote the operation "sum for all possible
samples and divide by the number of samples"* we have

* It would be useful to have a distinctive name for this operation, verb as well as substantive.

$$\chi^2\sigma^2 = \Sigma(\delta\,{}_aP)^2$$
$$= \Sigma(A\,dm_{1\cdot} + B\,dm_{\cdot 1} - dm_{11})^2$$
$$= A^2 m_{1\cdot} + B^2 m_{\cdot 1} + m_{11} + 2AB\,m_{11} - 2A\,m_{11} - 2B\,m_{11} - \frac{{}_aP^2}{N}$$
$$= A^2(a + c) + B^2(a + b) + (1 + 2AB - 2A - 2B)a - \frac{{}_aP^2}{N}$$
$$= (A + B - 1)^2 a + A^2 c + B^2 b - \frac{{}_aP^2}{N} \quad (84).$$
But
$${}_aP = A(a + c) + B(a + b) - a = (A + B - 1)a + Ac + Bb + 0 \cdot d \quad (85)$$
and $a + b + c + d = N$,
$$\therefore\ \chi^2\sigma^2 = \left(A + B - 1 - \frac{{}_aP}{N}\right)^2 a + \left(B - \frac{{}_aP}{N}\right)^2 b + \left(A - \frac{{}_aP}{N}\right)^2 c + \left(0 - \frac{{}_aP}{N}\right)^2 d$$
$$= \frac{1}{N^2}\left\{{}_dP^2\,a + {}_cP^2\,b + {}_bP^2\,c + {}_aP^2\,d\right\} \quad (86).$$
Further
$$(-{}_dP)a + ({}_cP)b + ({}_bP)c + (-{}_aP)d$$
$$= a\{N(A + B - 1) - {}_aP\} + b(BN - {}_aP) + c(AN - {}_aP) + d(-{}_aP)$$
$$= N\{(A + B - 1)a + Bb + Ac\} - N\,{}_aP$$
$$= N\{A(a + c) + B(a + b) - a\} - N\,{}_aP$$
$$= N\,{}_aP - N\,{}_aP = 0 \quad (87).$$
The above form (86) of the square of the standard deviation (omitting factor)
is interesting as involving only the squares of the $P$'s. Since the $P$'s are connected
by the relation (82) bis and (87) their values may be determined from any two of
them.
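The relations (79) to (87) lend themselves to a quick numerical check. The sketch below is illustrative only: the function names are hypothetical, and the inputs are the values of $A$ and $B$ for the first cross point of Table A (§ 14), with the quadrant frequencies read off that table under the assumption that $A$ multiplies $a + c$ and $B$ multiplies $a + b$, as in (85).

```python
def quadrant_P(A, B, a, b, c, d):
    """Return (aP, bP, cP, dP) from equations (79) to (82)."""
    N = a + b + c + d
    aP = A * (a + c) + B * (a + b) - a   # (79), with the margins written as in (85)
    bP = A * N - aP                      # (80)
    cP = B * N - aP                      # (81)
    dP = aP - N * (A + B - 1)            # (82)
    return aP, bP, cP, dP


def variance_form_84(A, B, a, b, c, d):
    """chi^2 sigma^2 in the form (84)."""
    N = a + b + c + d
    aP = quadrant_P(A, B, a, b, c, d)[0]
    return (A + B - 1) ** 2 * a + A ** 2 * c + B ** 2 * b - aP ** 2 / N


def variance_form_86(A, B, a, b, c, d):
    """chi^2 sigma^2 in the form (86), sums of squares of the P's only."""
    N = a + b + c + d
    aP, bP, cP, dP = quadrant_P(A, B, a, b, c, d)
    return (dP ** 2 * a + cP ** 2 * b + bP ** 2 * c + aP ** 2 * d) / N ** 2


# Table A divided at cross point 11: a = 193, a + b = 335, a + c = 358, N = 1000.
A11, B11 = 0.3887835, 0.4306151
a, b, c, d = 193, 142, 165, 500
aP, bP, cP, dP = quadrant_P(A11, B11, a, b, c, d)
check_87 = -dP * a + cP * b + bP * c - aP * d   # should vanish, equation (87)
```

With these inputs the two forms of the variance agree to rounding, the four $P$'s add to $N$, and ${}_aP$ comes out close to the 90.44 given for $P_{11}$ in Table A.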


The R’s. 
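Since $-\chi_{st}\,\delta r_{st} = \delta P_{st}$
and $-\chi_{s't'}\,\delta r_{s't'} = \delta P_{s't'}$,
$\chi_{st}\chi_{s't'}\,\delta r_{st}\,\delta r_{s't'} = \delta P_{st}\,\delta P_{s't'}$
and $\chi_{st}\chi_{s't'}\,\sigma_{st}\sigma_{s't'}\,R_{st\cdot s't'} = \Sigma(\delta P_{st}\,\delta P_{s't'}) = S_{st\cdot s't'}$ (say).


In conformity with this notation
$$\chi_{st}^2\,\sigma_{st}^2 = S_{st\cdot st}.$$
It is useful to have a verbal rule for writing down such mean products as


$$\Sigma(\delta P_{st}\,\delta P_{s't'}).$$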


The following will serve. 

Multiply the detached coefficients of the differentials in the $\delta P$'s as in ordinary
multiplication; strike out the products in which the related frequencies have no
common frequency and insert the common part of the frequencies after the related
coefficients. From the whole subtract the full products of the $P$'s divided by the
total frequency. This may be proved as follows:

Let $p$ and $q$ be any frequencies in a given distribution in which the population
$N$ is so large that sampling does not alter its composition; then we have the well-known
results (p. 99)
$$\Sigma(dp^2) = p\left(1 - \frac{p}{N}\right), \qquad \Sigma(dp\,dq) = -\frac{pq}{N}.$$
Now let $p$ and $q$ have a common part $c$ so that
$$p = p' + c, \qquad q = q' + c.$$
Then
$$\Sigma(dp\,dq) = \Sigma\, d(p' + c) \cdot d(q' + c)$$
$$= \Sigma(dp'\,dq' + dp'\,dc + dq'\,dc + dc^2)$$
$$= -\frac{p'q'}{N} - \frac{p'c}{N} - \frac{q'c}{N} + c\left(1 - \frac{c}{N}\right)$$
$$= -\frac{(p' + c)(q' + c)}{N} + c$$
$$= c - \frac{pq}{N}.$$

Now the mean product of any two linear functions of $p$'s and $q$'s,
$$S = \Sigma\, d(i_1 p_1 + i_2 p_2 + \ldots) \cdot d(k_1 q_1 + k_2 q_2 + \ldots),$$
will consist of the sum of the mean products of terms such as
$$i_s\,dp_s \cdot k_t\,dq_t.$$

But
$$\Sigma(i_s\,dp_s \cdot k_t\,dq_t) = i_s k_t\,\Sigma(dp_s \cdot dq_t) = i_s k_t\left(c_{st} - \frac{p_s q_t}{N}\right),$$
where $c_{st}$ is the common part of $p_s$ and $q_t$.

Therefore
$$S = \Sigma\, i_s k_t\left(c_{st} - \frac{p_s q_t}{N}\right) = \Sigma\, i_s k_t c_{st} - \frac{(\Sigma\, i_s p_s)(\Sigma\, k_t q_t)}{N}.$$
Hence the rule.
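The verbal rule can be compared with a direct cell-by-cell evaluation. The following Python sketch is an illustration under stated assumptions, not the author's procedure: the helper names are hypothetical, the cell covariances are taken as $n_i\delta_{ij} - n_i n_j/N$ (the large-population results quoted above), and the data are the Table A frequencies with the $A$'s and $B$'s of cross points 12 and 21 from § 14, the first suffix being read as the column group.

```python
# Cell frequencies of Table A keyed by (s, t): s = column group, t = row group.
FREQ = {(1, 1): 193, (2, 1): 122, (3, 1): 20,
        (1, 2): 134, (2, 2): 209, (3, 2): 78,
        (1, 3): 31, (2, 3): 113, (3, 3): 100}


def block(s, t):
    """Cells making up the cumulative frequency m_st (suffixes <= s and <= t)."""
    return frozenset(cell for cell in FREQ if cell[0] <= s and cell[1] <= t)


def mean_product_by_rule(terms_p, terms_q):
    """Rule: coefficient products times common part, minus full product over N."""
    N = sum(FREQ.values())
    total = lambda terms: sum(k * sum(FREQ[x] for x in cells) for k, cells in terms)
    common = sum(kp * kq * sum(FREQ[x] for x in (sp & sq))
                 for kp, sp in terms_p for kq, sq in terms_q)
    return common - total(terms_p) * total(terms_q) / N


def mean_product_by_cells(terms_p, terms_q):
    """Direct evaluation from the covariance of every pair of cells."""
    N = sum(FREQ.values())

    def per_cell(terms):
        coef = {x: 0.0 for x in FREQ}
        for k, cells in terms:
            for x in cells:
                coef[x] += k
        return coef

    cp, cq = per_cell(terms_p), per_cell(terms_q)
    return sum(cp[x] * cq[y] * ((FREQ[x] if x == y else 0) - FREQ[x] * FREQ[y] / N)
               for x in FREQ for y in FREQ)


A12, B12 = 0.8465906, 0.2021063
A21, B21 = 0.1597920, 0.8919072
# delta P_12 = A12 dm_1. + B12 dm_.2 - dm_12, and similarly for delta P_21.
dP12 = [(A12, block(1, 3)), (B12, block(3, 2)), (-1.0, block(1, 2))]
dP21 = [(A21, block(2, 3)), (B21, block(3, 1)), (-1.0, block(2, 1))]
S_rule = mean_product_by_rule(dP12, dP21)
S_cells = mean_product_by_cells(dP12, dP21)
```

Both routes give the same mean product, close to the value 2.9257 worked out for $S_{12\cdot 21}$ in the Table A example.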
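As an example consider $S_{11\cdot 21}$.
$$S_{11\cdot 21} = \Sigma(\delta P_{11}\,\delta P_{21})$$
$$= \Sigma(A_{11}dm_{1\cdot} + B_{11}dm_{\cdot 1} - dm_{11})(A_{21}dm_{2\cdot} + B_{21}dm_{\cdot 1} - dm_{21})$$
$$= A_{11}A_{21}m_{1\cdot} + A_{11}B_{21}m_{11} - A_{11}m_{11} + B_{11}A_{21}m_{21} + B_{11}B_{21}m_{\cdot 1} - B_{11}m_{21} - A_{21}m_{11} - B_{21}m_{11} + m_{11} - \frac{P_{11}P_{21}}{N}$$
$$= (A_{11} + B_{11} - 1)(A_{21} + B_{21} - 1)n_{11} + B_{11}(A_{21} + B_{21} - 1)n_{21} + B_{11}B_{21}n_{31}$$
$$+ A_{11}A_{21}n_{12} + 0 \cdot A_{21}n_{22} + 0 \cdot 0 \cdot n_{32}$$
$$+ A_{11}A_{21}n_{13} + 0 \cdot A_{21}n_{23} + 0 \cdot 0 \cdot n_{33}$$
$$- \frac{P_{21}}{N}\left\{(A_{11} + B_{11} - 1)n_{11} + B_{11}n_{21} + B_{11}n_{31} + A_{11}n_{12} + 0 \cdot n_{22} + 0 \cdot n_{32} + A_{11}n_{13} + 0 \cdot n_{23} + 0 \cdot n_{33}\right\}$$
$$- \frac{P_{11}}{N}\left\{(A_{21} + B_{21} - 1)n_{11} + (A_{21} + B_{21} - 1)n_{21} + B_{21}n_{31} + A_{21}n_{12} + A_{21}n_{22} + 0 \cdot n_{32} + A_{21}n_{13} + A_{21}n_{23} + 0 \cdot n_{33}\right\} + \frac{P_{11}P_{21}}{N}$$

$$= \left(A_{11} + B_{11} - 1 - \frac{P_{11}}{N}\right)\left(A_{21} + B_{21} - 1 - \frac{P_{21}}{N}\right)n_{11} + \left(B_{11} - \frac{P_{11}}{N}\right)\left(A_{21} + B_{21} - 1 - \frac{P_{21}}{N}\right)n_{21} + \left(B_{11} - \frac{P_{11}}{N}\right)\left(B_{21} - \frac{P_{21}}{N}\right)n_{31}$$
$$+ \left(A_{11} - \frac{P_{11}}{N}\right)\left(A_{21} - \frac{P_{21}}{N}\right)n_{12} + \left(0 - \frac{P_{11}}{N}\right)\left(A_{21} - \frac{P_{21}}{N}\right)n_{22} + \left(0 - \frac{P_{11}}{N}\right)\left(0 - \frac{P_{21}}{N}\right)n_{32}$$
$$+ \left(A_{11} - \frac{P_{11}}{N}\right)\left(A_{21} - \frac{P_{21}}{N}\right)n_{13} + \left(0 - \frac{P_{11}}{N}\right)\left(A_{21} - \frac{P_{21}}{N}\right)n_{23} + \left(0 - \frac{P_{11}}{N}\right)\left(0 - \frac{P_{21}}{N}\right)n_{33} \quad (88).$$
In the above the $P$'s are ${}_aP$'s and remembering that
$$A_{11} + B_{11} - 1 - \frac{{}_aP_{11}}{N} = -\frac{{}_dP_{11}}{N}, \text{ etc.},$$
we have
$$S_{11\cdot 21} = \left(-\frac{{}_dP_{11}}{N}\right)\left(-\frac{{}_dP_{21}}{N}\right)n_{11} + \left(\frac{{}_cP_{11}}{N}\right)\left(-\frac{{}_dP_{21}}{N}\right)n_{21} + \left(\frac{{}_cP_{11}}{N}\right)\left(\frac{{}_cP_{21}}{N}\right)n_{31}$$
$$+ \left(\frac{{}_bP_{11}}{N}\right)\left(\frac{{}_bP_{21}}{N}\right)n_{12} + \left(-\frac{{}_aP_{11}}{N}\right)\left(\frac{{}_bP_{21}}{N}\right)n_{22} + \left(-\frac{{}_aP_{11}}{N}\right)\left(-\frac{{}_aP_{21}}{N}\right)n_{32}$$
$$+ \left(\frac{{}_bP_{11}}{N}\right)\left(\frac{{}_bP_{21}}{N}\right)n_{13} + \left(-\frac{{}_aP_{11}}{N}\right)\left(\frac{{}_bP_{21}}{N}\right)n_{23} + \left(-\frac{{}_aP_{11}}{N}\right)\left(-\frac{{}_aP_{21}}{N}\right)n_{33}$$
$$= \frac{1}{N^2}\left\{{}_dP_{11}\,{}_dP_{21}\,n_{11} - {}_cP_{11}\,{}_dP_{21}\,n_{21} + {}_cP_{11}\,{}_cP_{21}\,n_{31} + {}_bP_{11}\,{}_bP_{21}(n_{12} + n_{13}) - {}_aP_{11}\,{}_bP_{21}(n_{22} + n_{23}) + {}_aP_{11}\,{}_aP_{21}(n_{32} + n_{33})\right\} \quad (89).$$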

The relation between the coefficients in the above expression is very simple.


We have already seen that
$$N^2\chi^2\sigma^2 = (-{}_dP)^2 a + ({}_cP)^2 b + ({}_bP)^2 c + (-{}_aP)^2 d \quad (90).$$
In the quadrants of a tetrachoric table write the $P$ coefficients. Thus
$$\begin{array}{c|c} -{}_dP & +{}_cP \\ \hline +{}_bP & -{}_aP \end{array}$$

The $a$ frequency is related to $-{}_dP$, etc.
Consider now the empty scheme of an enneachoric table regarded as a tetrachoric
table with the point of division first at, say, 11 and second at 21, and write in the $P$
coefficients as above.


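Divided at 11.
$$\begin{array}{c|c|c} -{}_dP_{11} & +{}_cP_{11} & +{}_cP_{11} \\ \hline +{}_bP_{11} & -{}_aP_{11} & -{}_aP_{11} \\ \hline +{}_bP_{11} & -{}_aP_{11} & -{}_aP_{11} \end{array}$$

Divided at 21.
$$\begin{array}{c|c|c} -{}_dP_{21} & -{}_dP_{21} & +{}_cP_{21} \\ \hline +{}_bP_{21} & +{}_bP_{21} & -{}_aP_{21} \\ \hline +{}_bP_{21} & +{}_bP_{21} & -{}_aP_{21} \end{array}$$

Now if we superpose these two schemes upon an enneachoric table with a
frequency in each cell, each cell will then contain the $P$ coefficient and frequency
of each term of the expansion of $S_{11\cdot 21}$ (with the omission of the factor $1/N^2$) thus

$$\begin{array}{c|c|c} (-{}_dP_{11})(-{}_dP_{21})\,n_{11} & ({}_cP_{11})(-{}_dP_{21})\,n_{21} & ({}_cP_{11})({}_cP_{21})\,n_{31} \\ \hline ({}_bP_{11})({}_bP_{21})\,n_{12} & (-{}_aP_{11})({}_bP_{21})\,n_{22} & (-{}_aP_{11})(-{}_aP_{21})\,n_{32} \\ \hline ({}_bP_{11})({}_bP_{21})\,n_{13} & (-{}_aP_{11})({}_bP_{21})\,n_{23} & (-{}_aP_{11})(-{}_aP_{21})\,n_{33} \end{array}$$

When 11 coincides with 21, $R$ becomes $= 1$ and the mean product degenerates
into the square of the standard deviation.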


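This may be summarised in the following table in which the letter, $a$, $b$, etc.
gives the suffix and the sign gives the sign of the $P$ required.

$$\begin{array}{c|cccc} & P_{11} & P_{12} & P_{21} & P_{22} \\ \hline n_{11} & -d & -d & -d & -d \\ n_{12} & +b & -d & +b & -d \\ n_{13} & +b & +b & +b & +b \\ n_{21} & +c & +c & -d & -d \\ n_{22} & -a & +c & +b & -d \\ n_{23} & -a & -a & +b & +b \\ n_{31} & +c & +c & +c & +c \\ n_{32} & -a & +c & -a & +c \\ n_{33} & -a & -a & -a & -a \end{array}$$

Thus the coefficient of $n_{32}$ in $S_{12\cdot 21}$ is $(+{}_cP_{12})(-{}_aP_{21})$.

This table is sufficient for a polychoric table of any size since any two cross
points $st$, $s't'$ in the table, with the planes through them divide it into nine portions
or groups of cells, each of which is represented by one of the above cells.

The relations between two superimposed tetrachoric divisions involve the determination
of ten constants, four $\sigma$'s, $\sigma_{11}$, $\sigma_{12}$, $\sigma_{21}$, $\sigma_{22}$, and six $R$'s, $R_{11\cdot 12}$, $R_{11\cdot 21}$,
$R_{11\cdot 22}$, $R_{12\cdot 21}$, $R_{12\cdot 22}$, $R_{21\cdot 22}$. The $\sigma$'s follow the example already given, the proper
suffixes being attached. The value of $S_{11\cdot 21}$ has already been given. The remaining
five $S$'s are as follows:

$$S_{11\cdot 12} = \frac{1}{N^2}\left\{{}_dP_{11}\,{}_dP_{12}\,n_{11} + {}_cP_{11}\,{}_cP_{12}(n_{21} + n_{31}) - {}_bP_{11}\,{}_dP_{12}\,n_{12} - {}_aP_{11}\,{}_cP_{12}(n_{22} + n_{32}) + {}_bP_{11}\,{}_bP_{12}\,n_{13} + {}_aP_{11}\,{}_aP_{12}(n_{23} + n_{33})\right\} \quad (91).$$

$$S_{11\cdot 22} = \frac{1}{N^2}\left\{{}_dP_{11}\,{}_dP_{22}\,n_{11} - {}_cP_{11}\,{}_dP_{22}\,n_{21} + {}_cP_{11}\,{}_cP_{22}\,n_{31} - {}_bP_{11}\,{}_dP_{22}\,n_{12} + {}_aP_{11}\,{}_dP_{22}\,n_{22} - {}_aP_{11}\,{}_cP_{22}\,n_{32} + {}_bP_{11}\,{}_bP_{22}\,n_{13} - {}_aP_{11}\,{}_bP_{22}\,n_{23} + {}_aP_{11}\,{}_aP_{22}\,n_{33}\right\} \quad (92).$$

$$S_{12\cdot 21} = \frac{1}{N^2}\left\{{}_dP_{12}\,{}_dP_{21}\,n_{11} - {}_cP_{12}\,{}_dP_{21}\,n_{21} + {}_cP_{12}\,{}_cP_{21}\,n_{31} - {}_dP_{12}\,{}_bP_{21}\,n_{12} + {}_cP_{12}\,{}_bP_{21}\,n_{22} - {}_cP_{12}\,{}_aP_{21}\,n_{32} + {}_bP_{12}\,{}_bP_{21}\,n_{13} - {}_aP_{12}\,{}_bP_{21}\,n_{23} + {}_aP_{12}\,{}_aP_{21}\,n_{33}\right\} \quad (93).$$

$$S_{12\cdot 22} = \frac{1}{N^2}\left\{{}_dP_{12}\,{}_dP_{22}(n_{11} + n_{12}) - {}_cP_{12}\,{}_dP_{22}(n_{21} + n_{22}) + {}_cP_{12}\,{}_cP_{22}(n_{31} + n_{32}) + {}_bP_{12}\,{}_bP_{22}\,n_{13} - {}_aP_{12}\,{}_bP_{22}\,n_{23} + {}_aP_{12}\,{}_aP_{22}\,n_{33}\right\} \quad (94).$$

$$S_{21\cdot 22} = \frac{1}{N^2}\left\{{}_dP_{21}\,{}_dP_{22}(n_{11} + n_{21}) + {}_cP_{21}\,{}_cP_{22}\,n_{31} - {}_bP_{21}\,{}_dP_{22}(n_{12} + n_{22}) - {}_aP_{21}\,{}_cP_{22}\,n_{32} + {}_bP_{21}\,{}_bP_{22}(n_{13} + n_{23}) + {}_aP_{21}\,{}_aP_{22}\,n_{33}\right\} \quad (95).$$


A more convenient form of the above for actual computation purposes will be
found on page 120.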


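We may now by means of the $P$'s express the standard deviation $\sigma_{r_p}$ in a form
consisting of sums of squares.

$$-(\Sigma C_{st})\,dr_p = \Sigma\left(\frac{C_{st}}{\chi_{st}}\,\delta P_{st}\right) \quad (96),$$
$$\therefore\ (\Sigma C_{st})^2\sigma_{r_p}^2 = \Sigma\left\{\Sigma\left(\frac{C_{st}}{\chi_{st}}\,\delta P_{st}\right)\right\}^2 = \Sigma\left(\frac{C_{st}}{\chi_{st}}\right)^2 S_{st\cdot st} + 2\Sigma\,\frac{C_{st}C_{s't'}}{\chi_{st}\chi_{s't'}}\,S_{st\cdot s't'}$$
$$= \frac{C_{11}}{\chi_{11}}\left(\frac{C_{11}}{\chi_{11}}S_{11\cdot 11} + \frac{C_{12}}{\chi_{12}}S_{11\cdot 12} + \frac{C_{13}}{\chi_{13}}S_{11\cdot 13} + \ldots\right)$$
$$+ \frac{C_{12}}{\chi_{12}}\left(\frac{C_{11}}{\chi_{11}}S_{11\cdot 12} + \frac{C_{12}}{\chi_{12}}S_{12\cdot 12} + \frac{C_{13}}{\chi_{13}}S_{12\cdot 13} + \ldots\right) + \text{etc.} \quad (97).$$

Now consider the $S$'s to be expanded in terms of the frequencies and pick out
all the coefficients of the frequency $n_{st}$ say. The coefficient of the $n_{st}$ taken from
$S_{lm\cdot l'm'}$ say will be $\frac{1}{N^2}P_{lm} \cdot P_{l'm'}$ in which the quadrant suffixes will be determined
by the relative position of $n_{st}$ to $lm$ and $l'm'$. Let these undetermined $P$'s be denoted
by $p$. We shall then have as the complete coefficient of $n_{st}$

$$\frac{1}{N^2}\left\{\frac{C_{11}}{\chi_{11}}\left(\frac{C_{11}}{\chi_{11}}\,p_{11} \cdot p_{11} + \frac{C_{12}}{\chi_{12}}\,p_{11} \cdot p_{12} + \frac{C_{13}}{\chi_{13}}\,p_{11} \cdot p_{13} + \ldots\right) + \frac{C_{12}}{\chi_{12}}\left(\frac{C_{11}}{\chi_{11}}\,p_{12} \cdot p_{11} + \frac{C_{12}}{\chi_{12}}\,p_{12} \cdot p_{12} + \frac{C_{13}}{\chi_{13}}\,p_{12} \cdot p_{13} + \ldots\right) + \text{etc.}\right\}$$

Since we are dealing throughout with the cell $n_{st}$ the quadrantal suffix (i.e. the
$a$, $b$, etc.) for any $l$, $m$ will be the same throughout. Hence we may write the
complete coefficient of $n_{st}$ as

$$\frac{1}{N^2}\left\{\frac{C_{11}}{\chi_{11}}\,p_{11}\left(\frac{C_{11}}{\chi_{11}}\,p_{11} + \frac{C_{12}}{\chi_{12}}\,p_{12} + \frac{C_{13}}{\chi_{13}}\,p_{13} + \ldots\right) + \frac{C_{12}}{\chi_{12}}\,p_{12}\left(\frac{C_{11}}{\chi_{11}}\,p_{11} + \frac{C_{12}}{\chi_{12}}\,p_{12} + \frac{C_{13}}{\chi_{13}}\,p_{13} + \ldots\right) + \text{etc.}\right\}$$
$$= \frac{1}{N^2}\left(\frac{C_{11}}{\chi_{11}}\,p_{11} + \frac{C_{12}}{\chi_{12}}\,p_{12} + \frac{C_{13}}{\chi_{13}}\,p_{13} + \ldots\right)^2 \quad (98).$$
In the case of an enneachoric table for example we have, $\Sigma C_{st}$ being $= 1$,
$$N^2\sigma_{r_p}^2 = \left(-\frac{C_{11}}{\chi_{11}}\,{}_dP_{11} - \frac{C_{12}}{\chi_{12}}\,{}_dP_{12} - \frac{C_{21}}{\chi_{21}}\,{}_dP_{21} - \frac{C_{22}}{\chi_{22}}\,{}_dP_{22}\right)^2 n_{11}$$
$$+ \left(\frac{C_{11}}{\chi_{11}}\,{}_bP_{11} - \frac{C_{12}}{\chi_{12}}\,{}_dP_{12} + \frac{C_{21}}{\chi_{21}}\,{}_bP_{21} - \frac{C_{22}}{\chi_{22}}\,{}_dP_{22}\right)^2 n_{12}$$
$$+ \left(\frac{C_{11}}{\chi_{11}}\,{}_bP_{11} + \frac{C_{12}}{\chi_{12}}\,{}_bP_{12} + \frac{C_{21}}{\chi_{21}}\,{}_bP_{21} + \frac{C_{22}}{\chi_{22}}\,{}_bP_{22}\right)^2 n_{13} + \text{etc.} \quad (99),$$

the $P$'s being at once written down from the table on page 114.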
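Or more generally thus:


Since the $P$ of any cell $n_{lm}$ with reference to any cross point ($st$) is invariable it
may be written generally as ${}_{lm}P_{st}$. This notation gives up the recognition of the
equality of the $P$'s in any given quadrant but gains in generality. The quadrantal
suffix and sign may be supplied by inspection. We have then the following lemma:

$$S_{st\cdot s't'} = \Sigma(\delta P_{st}\,\delta P_{s't'})$$
$$= \frac{1}{N^2}\left\{{}_{11}P_{st}\,{}_{11}P_{s't'}\,n_{11} + {}_{21}P_{st}\,{}_{21}P_{s't'}\,n_{21} + \ldots + {}_{12}P_{st}\,{}_{12}P_{s't'}\,n_{12} + \ldots\right\}$$
$$= \frac{1}{N^2}\sum_{lm}\left({}_{lm}P_{st}\,{}_{lm}P_{s't'}\,n_{lm}\right) \quad (100).$$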
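The lemma can be turned into a short routine. The sketch below is an illustration only and makes several assumptions that should be checked against the memoir: the first suffix is read as the column group of the printed table, the signed quadrant $P$ of a cell is formed as $N\alpha - {}_aP$ with $\alpha$ equal to $A + B - 1$, $B$, $A$ or 0 according to the quadrant, and the constants $A$, $B$, $\chi$ are copied from the Table A worksheet in § 14. All function and variable names are hypothetical.

```python
# Table A cell frequencies keyed by (l, m): l = column group, m = row group.
CELLS = {(1, 1): 193, (2, 1): 122, (3, 1): 20,
         (1, 2): 134, (2, 2): 209, (3, 2): 78,
         (1, 3): 31, (2, 3): 113, (3, 3): 100}
# (A, B, chi) for each cross point, as printed for Table A.
CONSTANTS = {(1, 1): (0.3887835, 0.4306151, 165.0602),
             (1, 2): (0.8465906, 0.2021063, 102.7353),
             (2, 1): (0.1597920, 0.8919072, 78.53720),
             (2, 2): (0.6208174, 0.7183872, 122.5923)}
N = sum(CELLS.values())


def alpha(s, t, l, m):
    """Detached coefficient of cell (l, m) in aP for the cross point (s, t)."""
    A, B, _ = CONSTANTS[(s, t)]
    if l <= s and m <= t:
        return A + B - 1      # leads to -dP
    if l > s and m <= t:
        return B              # leads to +cP
    if l <= s and m > t:
        return A              # leads to +bP
    return 0.0                # leads to -aP


def cell_P(s, t, l, m):
    """Signed quadrant P of cell (l, m) with reference to cross point (s, t)."""
    aP = sum(alpha(s, t, *cell) * n for cell, n in CELLS.items())
    return N * alpha(s, t, l, m) - aP


def S(st, s2t2):
    """Mean product S_{st.s't'} as the sum over cells in lemma (100)."""
    return sum(cell_P(*st, l, m) * cell_P(*s2t2, l, m) * n
               for (l, m), n in CELLS.items()) / N ** 2


cov_12_21 = S((1, 2), (2, 1)) / (CONSTANTS[(1, 2)][2] * CONSTANTS[(2, 1)][2])
var_11 = S((1, 1), (1, 1)) / CONSTANTS[(1, 1)][2] ** 2
```

Under these assumptions `cov_12_21` and `var_11` come out close to the figures .0003626 and .0018127 entered in the Table A worksheet.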
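The standard deviation of $r_p$ may then be developed as follows:
$$(\Sigma C_{st})^2\sigma_{r_p}^2 = \Sigma\left\{\Sigma\left(\frac{C_{st}}{\chi_{st}}\,\delta P_{st}\right)\right\}^2$$
$$= \Sigma\left(\frac{C_{st}}{\chi_{st}}\right)^2\Sigma(\delta P_{st})^2 + 2\Sigma\,\frac{C_{st}C_{s't'}}{\chi_{st}\chi_{s't'}}\,\Sigma(\delta P_{st}\,\delta P_{s't'})$$
$$= \frac{1}{N^2}\sum_{lm}\left[\left\{\sum_{st}\left(\frac{C_{st}}{\chi_{st}}\right)^2 {}_{lm}P_{st}^2 + 2\sum\frac{C_{st}C_{s't'}}{\chi_{st}\chi_{s't'}}\,{}_{lm}P_{st}\,{}_{lm}P_{s't'}\right\}n_{lm}\right]$$
$$= \frac{1}{N^2}\sum_{lm}\left\{\sum_{st}\frac{C_{st}}{\chi_{st}}\,{}_{lm}P_{st}\right\}^2 n_{lm}.$$

More fully written this is

$$(\Sigma C_{st})^2\sigma_{r_p}^2 = \frac{1}{N^2}\left\{\left(\frac{C_{11}}{\chi_{11}}\,{}_{11}P_{11} + \frac{C_{12}}{\chi_{12}}\,{}_{11}P_{12} + \frac{C_{13}}{\chi_{13}}\,{}_{11}P_{13} + \ldots\right)^2 n_{11} + \left(\frac{C_{11}}{\chi_{11}}\,{}_{12}P_{11} + \frac{C_{12}}{\chi_{12}}\,{}_{12}P_{12} + \frac{C_{13}}{\chi_{13}}\,{}_{12}P_{13} + \ldots\right)^2 n_{12} + \ldots\right\} \quad (103).$$

§ 9. THE STANDARD DEVIATION OF POLYCHORIC $r$ IN SPECIAL CASES.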


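The value of the standard deviation when $r = 0$ is of interest and may be got
as follows.


Assuming $r = 0$ throughout:

$$A_{st} = \epsilon(k_t) = \frac{m_{\cdot t}}{N}, \qquad B_{st} = \epsilon(h_s) = \frac{m_{s\cdot}}{N},$$
and writing $m'_{s\cdot} = N - m_{s\cdot}$ and $m'_{\cdot t} = N - m_{\cdot t}$, then
$${}_aP_{st} = \frac{m_{s\cdot}\,m_{\cdot t}}{N}, \qquad {}_bP_{st} = \frac{m'_{s\cdot}\,m_{\cdot t}}{N}, \qquad {}_cP_{st} = \frac{m_{s\cdot}\,m'_{\cdot t}}{N}, \qquad {}_dP_{st} = \frac{m'_{s\cdot}\,m'_{\cdot t}}{N}.$$
Substituting these values in the $S$'s we have after reduction
$$S_{11\cdot 11} = \frac{1}{N^3}\,m_{1\cdot}\,m'_{1\cdot}\,m_{\cdot 1}\,m'_{\cdot 1} \quad (104),$$
$$S_{12\cdot 12} = \frac{1}{N^3}\,m_{1\cdot}\,m'_{1\cdot}\,m_{\cdot 2}\,m'_{\cdot 2} \quad (105),$$
$$S_{21\cdot 21} = \frac{1}{N^3}\,m_{2\cdot}\,m'_{2\cdot}\,m_{\cdot 1}\,m'_{\cdot 1} \quad (106),$$
$$S_{22\cdot 22} = \frac{1}{N^3}\,m_{2\cdot}\,m'_{2\cdot}\,m_{\cdot 2}\,m'_{\cdot 2} \quad (107),$$
$$S_{11\cdot 12} = \frac{1}{N^3}\,m_{1\cdot}\,m'_{1\cdot}\,m_{\cdot 1}\,m'_{\cdot 2} \quad (108),$$
$$S_{11\cdot 21} = \frac{1}{N^3}\,m_{1\cdot}\,m'_{2\cdot}\,m_{\cdot 1}\,m'_{\cdot 1} \quad (109),$$
$$S_{11\cdot 22} = S_{12\cdot 21} = \frac{1}{N^3}\,m_{1\cdot}\,m'_{2\cdot}\,m_{\cdot 1}\,m'_{\cdot 2} \quad (110),$$
$$S_{12\cdot 22} = \frac{1}{N^3}\,m_{1\cdot}\,m'_{2\cdot}\,m_{\cdot 2}\,m'_{\cdot 2} \quad (111),$$
$$S_{21\cdot 22} = \frac{1}{N^3}\,m_{2\cdot}\,m'_{2\cdot}\,m_{\cdot 1}\,m'_{\cdot 2} \quad (112).$$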


From these values we get the o’s and R’s, 


me 1 On eH OG LL: 1 
On = NHK, N (= —) RRO RCE (113), 
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1 2 ae - =) 
eee ( ae (114), 
1 a. 7s oy My MW oy (- ] 
on = EEK y =) Pes. (115), 
ie ae ff ee cial 
2 = See vo ( ) “ee (116). 


[Ee ; alt =< 
Since m,.m’., = m,. (N — m,.) = No». 
Grn Ones 


[1 NYN HK’ 


and similarly for the others 


Mm. mi 
ites 1B (as 7 =e) vs. testeateee 117 
11°12 21°22 = e iN i “1M. ps (= €) ( ); 
m,.m' 
R ° = R Qe = ice = a = €.\ ence eeeee Il 
11°21 12°22 m’ 1M. (=) (118), 
7 7 
M1 .M 1M o.M . p 
Deeo ety SLL UM PH ier cocio0> 119). 
ee alae M'1.M’ 44Mg.Meg ( ) ee) 


With these values we have 
1 dy he G1 G2 


Gr IE Be. 


|. on €€' € ena 


Aooon is symmetrical with respect to the centre of the square, hence* 


lte!’ ete | |l—ece’ e—€ ; N2 Noo 

= ie =<) (15 9)" eae 

ete oe bet cell ile hen le ee” ae a); 1’ 4Mg.Meg 
Pees, (121). 


The remaining minors are easily reduced and we have after reduction and sub- 
stitution 


don ay ene (n, — ,) St (a — ah) 
°2 


on mM,.M'y. My. m.yM’ 4 

BPR eo udiso (122), 
Aooie ea <2) N5 Eee (H, ak, H.) Sy (Ce K,- K,) 
O12 M,.M'y. Mg. Mgt" oo \M oy 

woe nine aces (123), 
Ao (1 — 62) (1 — 9) 2, & jee i1,) se (K, - 7 Ks) 
O21 Ms.M'5. \M 4. MyM 4 Mag 

eth emcee (124), 
Aooee = eee) Hs) ae H, H,) Gy (A IK K,) 
Sop M,.m',. \M'. Mol’ co NNW 2 

deseswereen (125) 


* See Scott and Mathews, Theory of Determinants (2nd ed.), p. 89. 
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Summing these four latter expressions we have 
ide 2H,H, | H,? 


(N= Neogg = (= <2) = 2) NO 
0000 ( )( ) Ms. Ms.1,.  MyiM’s. 


RN en Oa Bae ie 


= l 
Mciibioee (eT te DSA anos 


: ries Aoooo 
Bae A AN 
0000 — 


OPE LOG TELM LLL 


3 He Nin eee |. | Keaqm ok Ghar is 


7 ey, 7 boas, 7 
MGM ag? MM A 8 Msg oo 


sah i 
Wane Masa © My... 


My. Mey 


Ny \He LSS Het Hoe x ky? Ee eR Re art 
ieee mM’ 5 
bee ite ) 
When the table is symmetrically divided in both categories and $r = 0$ we have
$m_{1\cdot} = m'_{2\cdot}$, $H_1 = H_2$, $K_1 = K_2$, etc. and the above reduces to
Neo Neo M1 


oS Se 
NeH?K? . 4 (“2 — 1) (2-1) 4N2 2 Kee: M2 4N°H* K 
Ny. M4 M1. a 
* VN 9 
and oS a rae con rede eed a caeenee (128) 


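§ 10. COMPARISON OF THE STANDARD DEVIATIONS OF POLYCHORIC $r$
AND ENNEACHORIC $r$.


We may now compare the standard deviations of $r_e$ and $r_p$.


Since
$$(\chi_{11} - \chi_{12} - \chi_{21} + \chi_{22})\,\delta r_e = -\delta P_{11} + \delta P_{12} + \delta P_{21} - \delta P_{22} \quad (129),$$
$$(C_{11} + C_{12} + C_{21} + C_{22})\,\delta r_p = -\left(\frac{C_{11}}{\chi_{11}}\,\delta P_{11} + \frac{C_{12}}{\chi_{12}}\,\delta P_{12} + \frac{C_{21}}{\chi_{21}}\,\delta P_{21} + \frac{C_{22}}{\chi_{22}}\,\delta P_{22}\right),$$

from which it appears as before (p. 108) that the enneachoric $r$ is equivalent to a
polychoric $r$ in which the weights of the $r$'s are $\chi_{11}$, $-\chi_{12}$, $-\chi_{21}$, $\chi_{22}$, i.e.
$$r_e = \frac{\chi_{11} r_{11} - \chi_{12} r_{12} - \chi_{21} r_{21} + \chi_{22} r_{22}}{\chi_{11} - \chi_{12} - \chi_{21} + \chi_{22}}.$$
Hence also the standard deviation of the enneachoric $r$ may be written
$$N^2(\chi_{11} - \chi_{12} - \chi_{21} + \chi_{22})^2\sigma_{r_e}^2 = (-{}_dP_{11} + {}_dP_{12} + {}_dP_{21} - {}_dP_{22})^2 n_{11} + \text{etc.} \quad (132).$$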
Upon expansion this reduces to 
o on le ale ele N (Ay al By oo 1) ale a 12 a N (Aqs als By» = i)| ae ete 
i aaleon a V (Aan | Ba — t) = oho, +N (A, + Bs — 1)) ™ , 
ue Le {(Ags a Ay) (Ajp A,,) (By, ay le a etc 
as (ee ere hy (Oeil sey ee or 
P,,—Py,—P jee) ae 
= lw {a — O42 + Boo — Bar a ue N ai a Ny, + ete. ...(133) 


(using a, 6 in the sense of p. 99), which is identical with the corresponding 


coefficient in o,? as given in equation (30). 


§ 11. FORMULAE FOR COMPUTATION.


The forms found for $S$ are not convenient for computation. The following have
been found more expeditious. Various other formulae are also collected for reference:


$$\chi_{st} = N z(h_s, k_t, r) = \frac{N}{2\pi\sqrt{1 - r^2}}\,e^{-\frac{h_s^2 + k_t^2 - 2r h_s k_t}{2(1 - r^2)}} \quad (134),$$
$$A_{st} = \epsilon\left(\frac{k_t - r h_s}{\sqrt{1 - r^2}}\right),$$
$$B_{st} = \epsilon\left(\frac{h_s - r k_t}{\sqrt{1 - r^2}}\right),$$
$$\Pi_{st} = A_{st} + B_{st} - 1,$$
$$P_{st} = A_{st}\,m_{s\cdot} + B_{st}\,m_{\cdot t} - m_{st};$$
when no quadrantal suffix is used $a$ is understood.
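These constants are easily computed by machine. The following Python sketch is offered as an illustration rather than as the author's scheme: it assumes that $\epsilon(x)$ denotes the normal probability integral from $-\infty$ to $x$ (a reading that reproduces the Table A worksheet of § 14), and the function names are hypothetical. The inputs are the values of $r$, $h$, $k$ and the marginal and cell totals for the first cross point of Table A.

```python
import math


def normal_integral(x):
    """Normal probability integral up to x (assumed meaning of epsilon)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def cross_point_constants(N, r, h, k, m_s, m_t, m_st):
    """Return (chi, A, B, Pi, P) for one cross point, following (134) onwards."""
    root = math.sqrt(1.0 - r * r)
    chi = N / (2.0 * math.pi * root) * math.exp(
        -(h * h + k * k - 2.0 * r * h * k) / (2.0 * (1.0 - r * r)))
    A = normal_integral((k - r * h) / root)
    B = normal_integral((h - r * k) / root)
    Pi = A + B - 1.0
    P = A * m_s + B * m_t - m_st
    return chi, A, B, Pi, P


# Table A, cross point 11: r = .498, h = -.36381, k = -.42615,
# m_1. = 358, m_.1 = 335, m_11 = 193, N = 1000.
chi11, A11, B11, Pi11, P11 = cross_point_constants(
    1000, 0.498, -0.36381, -0.42615, 358, 335, 193)
```

For this cross point the sketch gives values close to those of the worksheet: $\chi_{11}$ about 165.06, $A_{11}$ about .3888, $B_{11}$ about .4306, $\Pi_{11}$ about $-.1806$ and $P_{11}$ about 90.44.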
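$$S_{11\cdot 11} = \Pi_{11}^2\,a_{11} + A_{11}^2\,c_{11} + B_{11}^2\,b_{11} - \frac{P_{11}^2}{N} \quad (135),$$
($a_{11}$, $b_{11}$, $c_{11}$ being the quadrant frequencies of the division at 11, as in (84)), and similarly for $S_{12\cdot 12}$, etc.,

$$S_{11\cdot 12} = \Pi_{11}\Pi_{12}\,n_{11} + B_{11}B_{12}(n_{21} + n_{31}) + A_{11}\Pi_{12}\,n_{12} + A_{11}A_{12}\,n_{13} - \frac{P_{11}P_{12}}{N} \quad (136),$$
$$S_{11\cdot 21} = \Pi_{11}\Pi_{21}\,n_{11} + B_{11}\Pi_{21}\,n_{21} + B_{11}B_{21}\,n_{31} + A_{11}A_{21}(n_{12} + n_{13}) - \frac{P_{11}P_{21}}{N} \quad (137),$$
$$S_{11\cdot 22} = \Pi_{11}\Pi_{22}\,n_{11} + B_{11}\Pi_{22}\,n_{21} + B_{11}B_{22}\,n_{31} + A_{11}\Pi_{22}\,n_{12} + A_{11}A_{22}\,n_{13} - \frac{P_{11}P_{22}}{N} \quad (138),$$
$$S_{12\cdot 21} = \Pi_{12}\Pi_{21}\,n_{11} + B_{12}\Pi_{21}\,n_{21} + B_{12}B_{21}\,n_{31} + \Pi_{12}A_{21}\,n_{12} + B_{12}A_{21}\,n_{22} + A_{12}A_{21}\,n_{13} - \frac{P_{12}P_{21}}{N} \quad (139),$$
$$S_{12\cdot 22} = \Pi_{12}\Pi_{22}(n_{11} + n_{12}) + B_{12}\Pi_{22}(n_{21} + n_{22}) + B_{12}B_{22}(n_{31} + n_{32}) + A_{12}A_{22}\,n_{13} - \frac{P_{12}P_{22}}{N} \quad (140),$$
$$S_{21\cdot 22} = \Pi_{21}\Pi_{22}(n_{11} + n_{21}) + B_{21}B_{22}\,n_{31} + A_{21}\Pi_{22}(n_{12} + n_{22}) + A_{21}A_{22}(n_{13} + n_{23}) - \frac{P_{21}P_{22}}{N} \quad (141),$$
$$\chi_{st}^2\,\sigma_{st}^2 = S_{st\cdot st}, \qquad \sigma_{st} = \frac{\sqrt{S_{st\cdot st}}}{\chi_{st}},$$
$$\chi_{st}\chi_{s't'}\,\sigma_{st}\sigma_{s't'}\,R_{st\cdot s't'} = S_{st\cdot s't'} \quad (142),$$
$$R_{st\cdot s't'} = \frac{S_{st\cdot s't'}}{\chi_{st}\sigma_{st}\chi_{s't'}\sigma_{s't'}} = \frac{S_{st\cdot s't'}}{\sqrt{S_{st\cdot st}\,S_{s't'\cdot s't'}}} \quad (143).$$

In place of calculating $\sigma$ and $R$ it will be found easier to employ the $S$'s directly
by writing the equations for $C$ in the form

$$\frac{C_{11}}{\chi_{11}^2}S_{11\cdot 11} + \frac{C_{12}}{\chi_{11}\chi_{12}}S_{11\cdot 12} + \frac{C_{13}}{\chi_{11}\chi_{13}}S_{11\cdot 13} + \ldots = \lambda,$$
$$\frac{C_{11}}{\chi_{11}\chi_{12}}S_{11\cdot 12} + \frac{C_{12}}{\chi_{12}^2}S_{12\cdot 12} + \frac{C_{13}}{\chi_{12}\chi_{13}}S_{12\cdot 13} + \ldots = \lambda, \quad \text{etc.} \quad (144).$$

Eliminate the $\lambda$ by subtracting one equation from each of the others; put
$C_{11} = 1$ and solve by successive elimination for the remaining $C$'s. This is preferable
to using the determinant as it is at least no more laborious and lends itself to various
checks for accuracy. The $\lambda$ should be determined from each of the equations as a
further check. Then we have
$$r_p = \frac{\Sigma(C_{st}\,r_{st})}{\Sigma C_{st}}.$$

By putting $C_{11} = \chi_{11}$, $C_{12} = -\chi_{12}$, $C_{21} = -\chi_{21}$, $C_{22} = \chi_{22}$, we may derive the
enneachoric standard deviation from the polychoric $r$ in a form convenient for
computation in terms of the $S$'s,

$$(\chi_{11} - \chi_{12} - \chi_{21} + \chi_{22})^2\sigma_{r_e}^2 = S_{11\cdot 11} + S_{12\cdot 12} + S_{21\cdot 21} + S_{22\cdot 22} + 2(S_{11\cdot 22} + S_{12\cdot 21}) - 2(S_{11\cdot 12} + S_{11\cdot 21} + S_{12\cdot 22} + S_{21\cdot 22}) \quad (146).$$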


§ 12. COMPARATIVE RESULTS OF VARIOUS METHODS OF FINDING $r$ FROM A
3 × 3 TABLE.

In testing the methods developed in this paper upon actual material it was 
thought desirable to try them side by side with all the other methods of finding 
the correlation coefficient so that some indication could be got of their comparative 
accuracy. Each of the tables was therefore dealt with by nine methods which are 
indicated in § 13. These tables were selected at the beginning of the investigation, 
and had the course which the research has taken been foreseen probably a different 
selection might have been made. Two of them, I and III, are normal tables with an 
arbitrary population of 1000. In Table I the frequencies have been taken to the 
nearest integer and in III to the nearest two places of decimals, so that any irregu- 
larity in them is due to the roughness of the approximation to the true figures. 
In the 7,, we have an additional lack of approximation in taking 7, from the curve* 
for determining 7, and also in r,, ry and r, from finding the class index correlation 
from a small number of marginal groups. In II and IV we have actual samples.

A rough test of the value of the various methods may be made by finding
the mean square deviation of the calculated from the "observed" value of $r$, each
constituent being merely weighted with its total frequency, regarding the product
moment values of $r$ as the "observed" value.

Thus let $n_1$, $n_2$ = total frequency in Tables I, II; $R_1$, $R_2$ = product moment
value of the correlation coefficient in Tables I, II; $r_1$, $r_2$ = correlation coefficient
calculated by one of the methods, then writing

$$(n_1 + n_2 + \ldots)\,\Sigma^2 = n_1(R_1 - r_1)^2 + n_2(R_2 - r_2)^2 + \ldots,$$
we shall have in $\Sigma^2$ a measure of the goodness of the various methods. This gives
the following values of $\Sigma^2$.


* Tables for Statisticians and Biometricians, p. lvii and p. 65.

Mean weighted square deviation of calculated from "observed" or product
moment values of r.

                                                      Σ²       Σ² (omitting H)
Mean contingency                                    .00138        .00102
Mean square contingency                             .00089        .00060
Enneachoric r (r_e)                                 .00364        .00036
Polychoric r (r_p)                                  .00004        .00002
Tetrachoric r                                       .00018        .00016
Mean tetrachoric r (r_m)                            .00005        .00003
Mean weighted tetrachoric r (r_w)                   .00002        .00002
Three row η from mean dispersion*                   .00020        .00019
Three row η from "individual" dispersion            .00151        .00144
Marginal centroids                                  .00215        .00255
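The measure $\Sigma^2$ defined above is a frequency-weighted mean of squared differences and may be sketched as follows. The function name and the sample figures are hypothetical and serve only to show the arithmetic; they are not taken from the tables of the memoir.

```python
def weighted_mean_square_deviation(totals, observed, calculated):
    """Sigma^2: each table's squared deviation weighted by its total frequency."""
    weighted = sum(n * (R - r) ** 2
                   for n, R, r in zip(totals, observed, calculated))
    return weighted / sum(totals)


# Two hypothetical tables with totals 1000 and 500.
sigma_sq = weighted_mean_square_deviation(
    totals=[1000, 500],
    observed=[0.50, 0.60],     # product moment values taken as "observed"
    calculated=[0.51, 0.58])   # values from the method under test
```

A smaller value indicates that the method reproduces the product moment values more closely over the whole set of tables.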


I have given the value of $\Sigma^2$ including and omitting Table H, which gives very
anomalous results, as yet unexplained. Broadly the best results are given by $r_p$,
$r_m$ and $r_w$, and, Table H aside, the best result is by $r_p$. In the case of $r_\psi$ the results
are not quite satisfactory. The figure given was arrived at by taking the mean of
the raw figure from the curve and the same corrected for broad categories as 
suggested in Tables for Statisticians and Biometricians. An attempt was made to 
find an empirical formula which would give better results with the tables here de- 
scribed, but the result was not worthy of record. With three row y, although strictly 
the method is quite inapplicable to 3x3 tables, it may be useful to notice that 
when so applied the best results on the whole were got from assuming the 
distribution to be homoscedastic and using the mean dispersion of the arrays. This 
was largely due to several of the tables being divided so that some of the arrays 
contained very small frequencies which had therefore large probable errors, giving 
an undue effect on the result when squared. When such small frequencies are 
avoided the results appear to be about equally good. Of course our theory fails, 
as we have already pointed out, when any cell frequency is of the same order as 
its variation. 

Comparing the probable errors of $r_m$, $r_w$ and $r_p$ (tabulated for convenience in the
Appendix on page 133) it will be seen that on the whole they are in descending order
of magnitude. They differ very little from each other and, considering the labour
involved in finding $r_p$, $r_w$ would in most cases give a result with a sufficiently low
probable error.

The method of marginal centroids as already known is unsuited for tables with 
so few categories. 

An interesting and important relation which is not shown in the tables of
numerical results (§ 13) is the degree of correlation between $r_{11}$, $r_{12}$, etc., viz.
$R_{11\cdot 12}$, $R_{11\cdot 21}$, etc. These are collected in the table on p. 123.

All the enneachoric tables are arranged so that reading from $n_{11}$ to the right, and
downwards, $r$ is positive so that the values $R$ may be compared among each other.
It will be seen on examination that $R_{11\cdot 12}$, $R_{11\cdot 21}$, $R_{12\cdot 22}$, $R_{21\cdot 22}$ are on the whole
greater than $R_{11\cdot 22}$ and $R_{12\cdot 21}$ and of the two latter $R_{12\cdot 21}$ is usually the greater.


* See § 13, 8. 


A. RITCHIE-SCOTT


With regard to the computation of $r_p$ it will be seen from the example appended to
Table A that the amount of labour involved in dealing even with a 3 × 3 table is
considerable and will rapidly increase with the number of cells, and it is very
desirable that some short method of approximating to the weights ($C$'s) of the $r$'s
be devised. For the present it may be of interest to give here the $C$'s for the various
tables used.


                     R_{11·12}  R_{11·21}  R_{12·22}  R_{21·22}  R_{11·22}  R_{12·21}
A                      .3418      .3041      .3196      .3450      .0608      .1466
B                      .6977      .6040      .6132      .7050      .4048      .5231
C                      .4880      .4550      .4751      .4821      .1872      .2614
D                      .5100      .5678      .6038      .4907      .2620      .3221
E                      .2180      .4953      .5378      .1813      .0830      .1217
F                      .4252      .4422      .4133      .4307      .1209      .2392
G                      .6633      .6723      .6565      .6701      .3414      .4769
H                      .4395      .6592      .6567      .4367      .2391      .3219
K                      .3842      .3488      .3656      .3882      .1124      .1632
L                      .5014      .4831      .4798      .5162      .2276      .2611
M                      .5282      .2820      .2961      .5213      .0865      .2128
Pairs of brothers      .8203      .8203      .8732      .8732      .8837      .8756


                     C_{11}    C_{12}     C_{21}     C_{22}
A                      1       .51333     .38587     .71892
B                      1       .26872     .19654     .55240
C                      1       .27000     .26993     .34266
D                      1       .65737     .15786     .85445
E                      1      3.79397     .16186    3.81496
F                      1       .56452     .43667     .72594
G                      1      -.02973    -.13835     .69769
H                      1       .72919     .33838     .79708
K                      1       .59665     .48857     .66355
L                      1       .39951     .34626     .27119
M                      1       .36146     .38718     .74467
Pairs of brothers      1      -.26399    -.26399     .82793


The case of Table G, with negative weight for $r_{12}$ and $r_{21}$, is suggestive and needs
further study. The table has the characteristic that the mean is in n,, and the 
marginal frequencies are decreasing in magnitude and nearly equal in both sets of 
categories. The table "Pairs of brothers" which is accompanied by similar weights
is taken from Biometrika, vol. III, 1904, p. 182, and is given below. It compares the
athletic capacities of pairs of brothers.


                                  Second Brother

First Brother     Athletic    Betwixt    Non-athletic    Total
Athletic             906         20          140          1066
Betwixt               20         76            9           105
Non-athletic         140          9          370           519
Total               1066        105          519          1690

$r_{11} = .8046 \pm .0126$,
$r_{12} = .7190 \pm .0162$,
$r_{21} = .7190 \pm .0162$,
$r_{22} = .8028 \pm .0132$.
$C_{11} = 1$, $C_{12} = -.26399$, $C_{21} = -.26399$, $C_{22} = .82793$.
$r_p = .8382 \pm .0122$.
These negative weights require further investigation, particularly the conditions
for the existence of zero weights, but it is clear that certain divisions are to be
avoided in determining $r$ from a 3 × 3 table.

On the whole $C_{11}$, $C_{22}$, $C_{12}$, $C_{21}$ are in descending order of magnitude.


§ 13. PRÉCIS OF THE METHODS OF FINDING THE COEFFICIENT OF CORRELATION.


1. $r_\psi$. Mean contingency, corrected for class index correlation.

2. $r_\phi$. Mean square contingency, corrected for class index correlation and
where necessary for the number of cells.

3. $r_e$. By selecting the central cell, the method first described in this paper.
As its use treats any table as virtually 3 × 3, it may be called enneachoric $r$.

4. $r_p$. By weighting the $r$'s so that the P.E. shall be a minimum, the second
method described in this paper. As it is applicable to tables of any size it may be
called polychoric $r$.

5. $r_{11}$, $r_{12}$, $r_{21}$, $r_{22}$. Tetrachoric $r$ of the various quadrants. The probable errors
were calculated by the complete formula (P.E.) and also by the approximate method
(A.P.E.). (Tables for Statisticians and Biometricians, p. xl.)

6. $r_m$. The unweighted mean of $r_{11}$, $r_{12}$, $r_{21}$, $r_{22}$.

7. $r_w$. The mean of the $r_{11}$, etc., weighted by the reciprocals of the squares of
their standard deviation.

8. $\eta$. Three row $\eta$ calculated from each of the dividing planes as
planes of reference with a class index correction on the foot of the columns. Since
the standard deviation may be found in this case from the individual arrays or,
assuming the distribution sufficiently homoscedastic, may be given the mean
value $\sigma\sqrt{1 - r^2}$, I have used both methods for the purpose of comparison. These
are distinguished by the headings "individual dispersion" and "mean dispersion"
respectively*.

9. +,. By marginal centroids. 

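The probable error of $r_m$ and $r_w$ was obtained as follows:

Let the correlation coefficients $r_{11}$, $r_{12}$, ...
have the S.D.'s $\sigma_{11}$, $\sigma_{12}$, ...
and the weights $t_{11}$, $t_{12}$, ...


* The probable error of Biserial (or three row $\eta$) has now been given (Biometrika, Vol. IX, part IV),
but too late for use in the present paper.

Then
$$r_t = \frac{\Sigma\,t_{11} r_{11}}{\Sigma\,t_{11}}, \qquad (\Sigma\,t_{11})^2 (dr_t)^2 = \Sigma\,t_{11}^2 (dr_{11})^2 + 2\Sigma\,t_{11} t_{12}\,dr_{11}\,dr_{12},$$
$$\therefore\ \sigma_{r_t}^2 = \frac{\Sigma\,t_{11}^2\sigma_{11}^2 + 2\Sigma\,t_{11} t_{12}\,\sigma_{11}\sigma_{12}\,R_{11\cdot 12}}{(\Sigma\,t_{11})^2} \quad (147).$$
When $t_{11} = t_{12} = \ldots = 1$ we have the mean $r$, $r_m$, and if there are $l$ $r$'s
$$\sigma_{r_m}^2 = \frac{\Sigma\,\sigma_{11}^2 + 2\Sigma\,\sigma_{11}\sigma_{12}\,R_{11\cdot 12}}{l^2} \quad (148),$$
which for convenient computation may be written
$$\sigma_{r_m}^2 = \frac{1}{l^2}\left\{\Sigma\,\frac{S_{11\cdot 11}}{\chi_{11}^2} + 2\Sigma\,\frac{S_{11\cdot 12}}{\chi_{11}\chi_{12}}\right\}.$$
In finding the mean weighted $r$ we may regard $r_{11}$ as the mean of $t_{11}$ uncorrelated
values of $r$ of equal weight each having the S.D. $\sigma_0$. Hence $\sigma_{11}^2 = \dfrac{\sigma_0^2}{t_{11}}$ and $t_{11} = \dfrac{\sigma_0^2}{\sigma_{11}^2}$,
i.e. the weights are proportional to the reciprocals of the squares of the S.D.'s.
Putting this value in (147) we have
$$\sigma_{r_w}^2 = \frac{\Sigma\,\dfrac{1}{\sigma_{11}^2} + 2\Sigma\,\dfrac{R_{11\cdot 12}}{\sigma_{11}\sigma_{12}}}{\left(\Sigma\,\dfrac{1}{\sigma_{11}^2}\right)^2} \quad (149).$$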
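Formula (147) and its two special cases can be sketched in a few lines of Python. This is an illustration under stated assumptions: the function name is hypothetical, probable errors are used in place of standard deviations (the formula is homogeneous, so the result is then a probable error), and the inputs are the four tetrachoric $r$'s of Table A with their probable errors and the six $R$'s printed for that table in § 12.

```python
import math


def weighted_mean_with_error(r, err, R, weights):
    """Weighted mean of correlated r's and its error, following (147)."""
    total = sum(weights)
    mean = sum(t * x for t, x in zip(weights, r)) / total
    var = sum(weights[i] * weights[j] * err[i] * err[j] * R[i][j]
              for i in range(len(r)) for j in range(len(r))) / total ** 2
    return mean, math.sqrt(var)


# Order of the r's: r11, r12, r21, r22 (Table A).
r = [0.498, 0.510, 0.508, 0.504]
pe = [0.02872, 0.03210, 0.03505, 0.03259]
R = [[1.0,    0.3418, 0.3041, 0.0608],
     [0.3418, 1.0,    0.1466, 0.3196],
     [0.3041, 0.1466, 1.0,    0.3450],
     [0.0608, 0.3196, 0.3450, 1.0]]

r_m, pe_m = weighted_mean_with_error(r, pe, R, [1.0] * 4)               # (148)
r_w, pe_w = weighted_mean_with_error(r, pe, R, [1 / e ** 2 for e in pe])  # (149)
```

With these inputs the weighted mean comes out near .5045 with a probable error near .021, in agreement with the entry for $r_w$ under Table A, and the unweighted mean is .505.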


§ 14. DETAILS OF TABLES AND SUMMARY OF NUMERICAL RESULTS.
I. The first table examined was taken from Pearson and Heron's paper "On
Theories of Association," Biometrika, vol. IX, p. 220, Table XIV, and is a Gaussian
surface for $r = .5$ adjusted to give whole units in the cells.


l l 
1 2 Sue eo 4 De One teh 8 | Total 
1 i 20 5 2 2 — 34 | 
2 21 145 79 36 10 9 1 301 | 
3 6 94 85 D4 19 22 4 | 284 
4 2 32 39 31 12 17 4 137 
546 | — 18 28 oor ie pall 18 5 105 
7 = i 22 24 12 22 q 98 
8 = 2 Gapiposse oe |= 13 7 41 | 
Total 36 | 322 | 264 180 69 | 101 28 | 1000 | 
| | I | 


The frequency in heavy type contains the mean of the surface. 
A. Table I divided so that the mean falls in cell $n_{22}$.


               1+2      3+4      5+6+7+8     Total

1+2            193      122         20        335
3+4            134      209         78        421
5+6+7+8         31      113        100        244
Total          358      444        198       1000
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ty = 482 
r, = 4840 + -0170 A.P.E. 
Ty = 50346 + -02094 \ 74, = -498 + 02872 (-0290) 
r, = 48594 + -04918 To = -510 + -03210 (-0303) 
Tm = 5050 + +0246 [ 1, = -508 + 03505 (-0321) 
%, = 5045 + -0211 Too = °504 + -03259  (-0340) 
r, = °5145 
Mean Individual 
dispersion dispersion 
hy 5031 4950 
Ns “5057 4975 
hy 5045 -4955 
Nhe 5058 4949 


I here insert as an illustration of the new method the constants required in
finding $r_p$ for the above table, and the calculation of $S_{12\cdot 21}$, $P_{11}$ and $\Pi_{11}$, and the
equations to find the $C$'s.


Table A                      h1 k1          h1 k2          h2 k1          h2 k2
r                            .498           .510           .508           .504
h                           -.36381        -.36381         .84879         .84879
rh                          -.18118        -.18554         .43117         .42779
z(h)                         .3733945       .3733945       .2782707       .2782707
k                           -.42615         .69349        -.42615         .69349
rk                          -.21222         .35368        -.21648         .34952
z(k)                         .3643145       .3136735       .3643145       .3136735
h - rk                      -.15159        -.71749        1.06527         .49927
(h - rk)/√(1 - r²)          -.1748086      -.8341218      1.236735        .5780570
z of the above               .3928931       .2817266       .1856865       .3375594
B = ε of the above           .4306151       .2021063       .8919072       .7183872
k - rh                      -.24497         .87903        -.85732         .26570
(k - rh)/√(1 - r²)          -.2824913      1.021920       -.9953132       .3076287
z of the above               .3833377       .2366676       .2431048       .3805048
A = ε of the above           .3887835       .8465906       .1597920       .6208174
χ:                         165.0602       102.7353        78.53720      122.5923
Π                           -.1806014       .0486969       .0516992       .3392046
P                           90.44050      128.8716       111.9422       383.0020
S_{11·:}/χ_{11}χ:*           .001812696     .000692554     .0006728030    .0001333663
S_{12·:}/χ_{12}χ:                           .002265285     .0003626002    .0007349840
S_{21·:}/χ_{21}χ:                                          .002700292     .0008662932
S_{22·:}/χ_{22}χ:                                                         .002335123
C:                          1               .51333         .3859          .71892


* The suffix : indicates that appropriate suffix is to be taken from the column.

Calculation of $S_{12\cdot 21}$.
Π_{12} Π_{21} n_{11} = .0486969 × .0516992 × 193      =    .485895
B_{12} Π_{21} n_{21} = .2021063 × .0516992 × 122      =   1.274745
B_{12} B_{21} n_{31} = .2021063 × .8919072 × 20       =   3.605201
Π_{12} A_{21} n_{12} = .0486969 × .1597920 × 134      =   1.042704
B_{12} A_{21} n_{22} = .2021063 × .1597920 × 209      =   6.749650
A_{12} A_{21} n_{13} = .8465906 × .1597920 × 31       =   4.193630
                                              Sum     =  17.351825
P_{12} P_{21}/N = 128.8716 × 111.9422 / 1000          =  14.426170
                                       S_{12·21}      =   2.925655
χ_{12} χ_{21} = 102.7353 × 78.5372                    = 8068.543
S_{12·21}/χ_{12}χ_{21}                                =    .0003626002

Calculation of $P_{11}$.                         Calculation of $\Pi_{11}$.

A_{11} m_{1.} = .3887835 × 358 = 139.1845        A_{11}   .3887835
B_{11} m_{.1} = .4306151 × 335 = 144.2560        B_{11}   .4306151
                                 283.4405                 .8193986
m_{11}                           193                     1
P_{11}                         =  90.4405        Π_{11} = -.1806014


Equations to find $C$,
.001812696 $C_{11}$ + .000692554 $C_{12}$ + .000672803 $C_{21}$ + .000133366 $C_{22}$ = $\lambda$,
.000692554 $C_{11}$ + .002265285 $C_{12}$ + .000362600 $C_{21}$ + .000734984 $C_{22}$ = $\lambda$,
.000672803 $C_{11}$ + .000362600 $C_{12}$ + .002700292 $C_{21}$ + .000866293 $C_{22}$ = $\lambda$,
.000133366 $C_{11}$ + .000734984 $C_{12}$ + .000866293 $C_{21}$ + .002335123 $C_{22}$ = $\lambda$.
The solution of these equations gives the first row of figures in the $C$ table on
page 123.


B. Table I divided so that the mean falls in cell n,,. 


             1+2+3     4     5+6+7+8     Total

1+2+3         462     92        65         619
4              73     31        33         137
5+6+7+8        87     57       100         244
Total         622    180       198        1000
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ly = 533 
rg =°510 + -0161 A.P.E. 
ty = 5007 + -0250 ) 71, = -499 + -028 (-028) 
r, = 5073 + -1487 | 7. = -501 + -030 = (-030) 
%m = °5010 + -0254 | 72, = -500 + -031 = (-032) 
Ty = °5008 + -0253 } eo. = -504 + -033 = (-034) 
vr, == °5445 
Mean Individual 
dispersion dispersion 
Ne, 4921 4873 
Nea “4917 -4763 
Th, -4858 -4802 
nh, 4881 -4671 
C. Table I divided so that the mean falls in cell »,,. 
             1+2+3     4+5+6     7+8     Total

1+2+3         462       121       36       619
4+5+6         119        79       44       242
7+8            41        49       49       139
Total         622       249      129      1000
ly = 537 
ry = °5183 + -0202 A.P.E. 
Tp = 4988 + -0241 \ 71, = -499 + -028 (-028) 
r, = °5055 + -0652 | 7,, = -490 + 035 (-035) 
Tm = *4985 + -0253 | 7, = -505 + -035 _ (-035) 
Ty = °4973 + -0311 Too = 500 + 039 = (-043) 
7, = °5480 
Mean Individual 
dispersion dispersion 
Ney 4895 -5041 
"ky 4934 4615 
"hn, -4846 4730 
ns -4947 4496 
D. Table I divided so that the mean falls in cell 14». 
               1+2+3     4     5+6+7+8     Total

1+2             277     38        20         335
3               185     54        45         284
4+5+6+7+8       160     88       133         381
Total           622    180       198        1000


EH. 


The mean is in cell ny. 
| 
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Mey 
Nhe 
hy 
Nhe 


0631 
*0235 
2330 
+ :0239 
0236 


741 = ‘501 + -030 
Te = -499 + -028 
Loy = -508 + *035 
Mean Individual 
dispersion dispersion 
-§211 -4798 
*4885 -4907 
-4956 -4709 
“5115 +4553 


Table I divided so that the frequency of ,, differs very little from the 
frequency of a table with the same marginal frequencies but of zero correlation. 


r 


he 


Here 


                 1+2      3      4+5+6+7+8     Total

1                 27      5          2           34
2                166     79         56          301
3+4+5+6+7+8      165    180        320          665
Total            358    264        378         1000
301 × 264 / 1000 = 79.464, so that the constant term in the equation for r is a
small quantity and any error of sampling will have an excessive weight. It will be


found as one might expect that the p.n. of 7, is very large. 


The very large value of 7, is due to the column having the marginal total 378, 
for the frequency 2 in it is the nearest whole number to a true value and being 
so small, a small absolute difference makes a large fractional value resulting in 
a large difference between the true and apparent standard deviations of this par- 
ticular array. Actually the method applied is inapplicable to a frequency of this 


order. 
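
The comparison made for this table, between an observed cell and the cell of a table with the same marginal frequencies but zero correlation, can be sketched as follows. This is only an illustrative check, not part of the original computation: the function name is hypothetical, and the two cell entries 2 and 56 are not legible in the table as printed and are inferred here from the marginal totals.

```python
from fractions import Fraction

# Table E, rows 1, 2, 3+...+8 and columns 1+2, 3, 4+...+8.
# The entries 2 and 56 are inferred from the margins (assumption).
observed = [
    [27, 5, 2],
    [166, 79, 56],
    [165, 180, 320],
]


def independence_frequencies(table):
    """Cell frequencies of a table with the same margins but zero correlation."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[Fraction(r * c, n) for c in col_totals] for r in row_totals]


expected = independence_frequencies(observed)
# Difference between the observed middle cell and 301 * 264 / 1000.
middle_gap = observed[1][1] - expected[1][1]
print(float(expected[1][1]), float(middle_gap))
```

The difference for the middle cell is less than half a unit, which is the sense in which the constant term is "a small quantity" relative to errors of sampling.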


Vy 
Vg 




502 


-4827 + -0246 
4991 + -0245 > 
4658 + -4371 

-4995 + -0327 To = 
-4995 + -0247 } ry = 


5065 


Ny 
UL) 
hy 
Nh 


Usk 


UG} 


Mean 


dispersion 


-4862 
-5169 
-4950 
-4966 


500 + 056 
-498 + -029 
500 + -072 
500 + -030 


Individual 
dispersion 


-4906 
-4991 
+5022 
‘7150 
II. The second table examined was taken from Macdonell’s paper “On Criminal
Anthropometry,” Biometrika, Vol. I, p. 216. The original table is too extensive to be
given here, but may be found in loc. cit. The horizontal categories are the heights of
3000 criminals in feet and inches, and the vertical categories the lengths of their
left middle fingers in millimetres. The correlation coefficient found by the product
moment method is .6608 ± .0069.


F. Table II divided so that the mean falls in cell nyo. 


| 55,%,”-64,%.”” 64,9,/"-66,9,” 66,5,”-77” | Total 
* | 
9-4-11-3 mm. 682 270 101 1053 
11-4-11-:7 mm. 282 351 286 919 
11-8-13-5 mm. 90 299 639 1028 | 
Total 1054. 920 1026 3000 
ry = °6635 
t, = 0170 4-000 A.P.E. 
r, = 6544 + -O101 \ ry, = 667 + 013 (-014) 
r, = 6316 + -0301 | 7,. = -670 + -014 (013) 
Tm = 6530 + -0101 [ 7, = -644 + -015 (-014) 
r,, = 6538 + -O101 } rag = 631 + -014 (-014) 
Ooi 
Mean Individual 
dispersion dispersion 
"hy -6477 -6295 
Nky -6548 -6306 
nk, -6647 6151 
ho 6345 -6510 
G. Table II divided so that the mean falls in cell n,,. 
55%,”-65,%," | 65,%,"”-66,%," | 66.9,”-77” Total 
oe el = a - r os = 
9-4-11-5 mm. 1122 176 216 1514 
11-6-11-7 mm. 191 96 171 458 
11-8-13-5 mm. 203 186 639 1028 
Total Tst6n a das 1026 3000 
Vy = “731 
ry = 6426 + -0077 A.P.E. 


t, = °6613 + -0108 \ 7,, = 680 + -012 (-013) 
r, = °6808 + “Ne | 11. = 668 + 013 (-013) 

+ -0111 Lee 642 + -014 (-014) 
ry = 6573 + -0112 } 9, = -631 4-014 (-014) 
7, = (182 


= 
3 
I 
a 
Or 
OO 
eo 
! 
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Mean Individual 
dispersion dispersion 
6553 642] 
6473 6546 
“6657 -6669 
6277 “7113 


H. Table II divided so that the mean falls in cell 15. 
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55 yp/-654%;"" | 65;%4"-66;5" | 66y4"-77" | Total | 
9-4-11-3 mm. 840 112 101 1053 
11-4-11-7 mm. 473 160 286 919 | 
11-8-13-5 mm. 203 186 639 1028 
Total 1516 458 1026 3000 
ry = -669 
re = 6172 + -0088 A.P.E. 
ry = 6479 + -0107 \ ry = -648 + -014 (-014) 
r, = -5162 + -0438 | 7, = -668 4-013 (-013) 
1m = °6478 + -0108 [ ro, = -644 + -015  (-014) 
ty = °6591 + -0107 } rap = 631 + -014 (-014) 
r, = °7920 
Mean Individual 
dispersion dispersion 
Ny 6539 6279 
No 6472 “6156 
Mh, -6479 6103 
Nho -6345 6418 


III. The third table examined was taken from Pearson and Heron’s paper
“On Theories of Association,” Biometrika, vol. IX, p. 219, Table XIII, and is a
normal surface having r = .3. The values of the frequencies to two places of
decimals were used.


            1        2        3        4       5+6       7        8      Total

1          4.04    17.16     7.55     3.30     0.91     0.92     0.12      34
2         17.41   123.59    79.76    44.64    14.61    17.67     3.32     301
3          8.86    93.00    78.31    52.04    19.20    26.40     6.19     284
4          2.83    37.73    37.24    27.51    10.95    16.31     4.43     137
5+6        1.62    25.21    27.75    22.09     9.26    14.64     4.43     105
7          1.02    19.50    24.47    21.39     9.58    16.36     5.68      98
8          0.22     5.81     8.92     9.03     4.49     8.70     3.83      41
Total     36      322      264      180       69      101       28      1000
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K. Table III divided so that the mean falls in cell np. 


Table 


              1+2        3+4      5+6+7+8     Total

1+2         162.20     135.25      37.55        335
3+4         142.42     195.10      83.48        421
5+6+7+8      53.38     113.65      76.97        244
Total       358        444        198          1000
ry = +3088 
Tg = °2960 + -0199 A.P.E. 
Tp = 3000 + -0246 )\ 7,, = -300 + -033 (-034) 
7, = °3000 + -0991 | 72 = 300 + -036  (-035) 
'm = °3000 + -0249 } 11 = °300 + 038 (-037) 
Ty = 3000 + -0247 J r.. = -800 + -038 (-039) 
fo — 3025 
Mean Individual Mean Individual 
dispersion dispersion dispersion dispersion 
A “3011 +2988 7h +2998 +2997 
Nie -2999 -3003 ie -2989 2989, 
L. Table III divided so that the mean falls in cell n,;. 
             1+2+3      4+5+6       7+8      Total

1+2+3       429.68     134.70     54.62       619
4+5+6       132.38      69.81     39.81       242
7+8          59.94      44.49     34.57       139
Total       622        249       129         1000
Py => °330 
| ry = °3095 + -0225 A.P.E. 


ry = *8000 + -0282 ) 7,, = -300 + 032 (-033 
r, = 3000 + -1031 Tyo = -300 + 039 ( 

= -3000 + -0326 15, = °300 + -040 (-041 
Fy = +3000 + -0285 } reo = -300 + 046 ( 


r, = “8152 


Mean Individual 

dispersion dispersion 
Nk, +2965 -2975 
Ne -2910 +2898 
Mh, +2953 +2975 
Nhe +2909 2871 


IV. The fourth table examined was from Pearson and Lee’s paper “On the
Distribution of Frequency (Variation and Correlation) of Barometric Heights at
Divers Stations,” Phil. Trans. A, 1897, vol. 190, p. 453, Table IX. The original
table is too extensive for reproduction and may be found in loc. cit. A condensed
form of it will be found in Biometrika, vol. IX, 1913, p. 223, Table XVIII.

This was selected as an example of a very skew distribution. The correlation
coefficient found by Product Moments is .780 (Biometrika, vol. IX, p. 223).


M. Table IV divided so as to give a reasonably large frequency in the cell gp. 
The mean falls in the cell ny,. 


                     30.1" and over    30"-29.8"    29.7" and under     Total

29.9" and over           1086.5           412             43           1541.5
29.8"-29.7"               144.5           275            103            522.5
29.6" and under            56.5           323            478.5          858.0
Total                    1287.5          1010            624.5         2922
fy = 189 
Tg = -7504 + -0210 A.P.E. 
ty = *7864 + -0077 \ 14, = +780 4-010 (-019) 
r, =*7745 + -0151 | 74. =-787 4-011 (-011) 
im = *T8TT + 0078 f 75, = -795 + 012 (-011) 
ly = *7858 + -O0077 To = -785 + -O11 ( 012) 
r, = 8770 
Mean Individual Mean Individual 
dispersion dispersion dispersion dispersion 
Ne, *7857 ‘7116 Nhy -7962 “7417 
Nhs -7812 -6951 Nhe “8065 6841 
Appendix.

Probable errors of r_m, r_w, r_p.

         P.E. of           P.E. of          P.E. of
         arithmetic        weighted         polychoric r
         mean (r_m)        mean (r_w)       (r_p)

A          .0246             .0211            .0209
B          .0254             .0253            .0250
C          .0253             .0311            .0241
D          .0239             .0236            .0235
E          .0327             .0247            .0245
F          .0101             .0101            .0101
G          .0111             .0112            .0108
H          .0108             .0107            .0107
K          .0249             .0247            .0246
L          .0326             .0285            .0282
M          .0078             .0077            .0077


My thanks are due to Professor Pearson, who suggested the enquiry, for his 
ever ready help and advice throughout the work. I have also to thank Miss Alison 
Robertson for assistance in reading the proofs. 


ON A FORMULA FOR THE PRODUCT-MOMENT COEFFICIENT 
OF ANY ORDER OF A NORMAL FREQUENCY DISTRIBUTION 
IN ANY NUMBER OF VARIABLES. 


By L. ISSERLIS, D.Sc. 


1. In Biometrika, Vol. XI, Part III, I have shown that for a normal frequency
distribution in four variables, if

$$p_{xyzt} = \underset{x}{S}\,\underset{y}{S}\,\underset{z}{S}\,\underset{t}{S}\,\{n_{xyzt}\,xyzt\}/N$$

denotes the product-moment coefficient of the distribution about the means of the
four variables and $q_{xyzt}$ is the reduced moment, i.e.

$$q_{xyzt} = p_{xyzt}/\sigma_x\sigma_y\sigma_z\sigma_t,$$

then

$$q_{xyzt} = r_{xy}r_{zt} + r_{xz}r_{yt} + r_{xt}r_{yz} \qquad (1).$$

In this result any two or more variables may be made identical, leading to a
variety of results for moment coefficients of distributions containing fewer than
four variables but of total order four; for example, identifying $t$ with $x$ we obtain

$$q_{x^2yz} = r_{yz} + 2r_{xy}r_{xz} \qquad (2),$$

and putting $y = z = t = x$ we find $q_{x^4} = 3$; of course $q_{xy} = r_{xy}$, and $q_{x^4}$ is merely $\beta_2$.

I suggested that (1) was probably capable of generalisation, and I now propose
to prove a general theorem which gives immediately the value of the mixed moment
coefficient of any order in each variable for a normal frequency distribution in any
number of variables.


2. Consider a normal distribution, total population $N$. Let $N_{12\ldots n}$ denote the
frequency of the group in which the characters differ by $x_1, x_2, \ldots x_n$ from the mean
values for the whole population and let

$$p_{1^{l_1}2^{l_2}\ldots n^{l_n}} = S\,(N_{12\ldots n}\,x_1^{l_1}x_2^{l_2}\ldots x_n^{l_n})/N \qquad (3)$$

denote the moment coefficient of the most general kind about the mean values of
the characters. The corresponding reduced moment will be

$$q_{1^{l_1}2^{l_2}\ldots n^{l_n}} = p_{1^{l_1}2^{l_2}\ldots n^{l_n}}/(\sigma_1^{l_1}\sigma_2^{l_2}\ldots\sigma_n^{l_n}) \qquad (4).$$

Then for normal distributions,

if $n$ be odd, $$q_{12\ldots n} = 0 \qquad (5),$$

and if $n$ be even, $$q_{12\ldots n} = S\,(r_{ab}r_{cd}\ldots r_{hk}) \qquad (6),$$

where the summation on the right-hand side extends to every possible selection of
$n/2$ pairs $ab, cd, \ldots hk$, that can be formed out of the $n$ suffixes $1, 2, 3, \ldots n$;
equation (1) is thus a particular case of (6).

Equation (6) is the theorem it is proposed to prove. The value of $q_{1^{l_1}2^{l_2}\ldots n^{l_n}}$
is at once found for given numerical values of the indices $l_1, l_2, \ldots l_n$ by writing
down (6) for $l_1 + l_2 + \ldots + l_n$ variables and identifying the values of $l_1$ of them with
that of the first and so on.
For example if we require the value of $q_{1^22^23^2}$ we commence with

$$\begin{aligned}
q_{123456} &= S\,(r_{ab}r_{cd}r_{ef})\\
&= r_{12}(r_{34}r_{56} + r_{35}r_{46} + r_{36}r_{45}) + r_{13}(r_{24}r_{56} + r_{25}r_{46} + r_{26}r_{45})\\
&\quad + r_{14}(r_{23}r_{56} + r_{25}r_{36} + r_{26}r_{35}) + r_{15}(r_{23}r_{46} + r_{24}r_{36} + r_{26}r_{34})\\
&\quad + r_{16}(r_{23}r_{45} + r_{24}r_{35} + r_{25}r_{34})
\end{aligned} \qquad (7).$$

Identifying 4 with 1, 5 with 2 and 6 with 3 we find at once

$$q_{1^22^23^2} = 1 + 2r_{23}^2 + 2r_{31}^2 + 2r_{12}^2 + 8r_{12}r_{23}r_{31} \qquad (8).$$
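
The rule of writing down every selection of pairs and then identifying suffixes lends itself to direct enumeration. The following is a minimal sketch, not the author's procedure: the function names `pairings` and `reduced_moment` are hypothetical, the correlations are arbitrary illustrative fractions, and a repeated suffix is treated as a correlation of unity.

```python
from fractions import Fraction


def pairings(items):
    """Yield every way of splitting an even-length list into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def reduced_moment(suffixes, r):
    """Sum of products r_ab r_cd ... over all pairings of the suffix list.

    `suffixes` lists each variable once per unit of its index, e.g.
    [1, 1, 2, 2, 3, 3] for q_{1^2 2^2 3^2}; `r` maps a pair (a, b) with
    a < b to r_ab, and r_aa is taken as 1.
    """
    if len(suffixes) % 2:
        return 0  # odd total order: the reduced moment vanishes
    total = 0
    for pairing in pairings(list(suffixes)):
        term = 1
        for a, b in pairing:
            term *= 1 if a == b else r[tuple(sorted((a, b)))]
        total += term
    return total


# Illustrative correlations only (hypothetical values).
r = {(1, 2): Fraction(1, 2), (1, 3): Fraction(1, 3), (2, 3): Fraction(1, 4)}
q = reduced_moment([1, 1, 2, 2, 3, 3], r)
closed = (1 + 2 * (r[(1, 2)] ** 2 + r[(1, 3)] ** 2 + r[(2, 3)] ** 2)
          + 8 * r[(1, 2)] * r[(1, 3)] * r[(2, 3)])
print(q, closed)
```

Exact fractions are used so that the enumerated sum and the closed form can be compared without rounding.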


3. We note first that $q_{1^n}$, which in the more usual notation for distributions in
one variable is $\mu_n/\mu_2^{n/2}$, is known to have the value $1.3.5\ldots(n-1)$ when $n$ is even.
As regards $S\,(r_{ab}r_{cd}\ldots r_{hk})$, if all the $n$ variables are made identical, each term
becomes unity and the number of terms is the same as the number of ways of breaking
up an even number ($n$) of objects into ($n/2$) pairs. This last number is clearly

$$\frac{1}{(n/2)!}\cdot\frac{n!}{2!\,(n-2)!}\cdot\frac{(n-2)!}{2!\,(n-4)!}\cdots\frac{4!}{2!\,2!},$$

which also reduces to $1.3.5\ldots(n-1)$; thus equation (6) is correct for this
particular case.
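
The counting argument can be checked numerically. The sketch below is only an illustration with hypothetical function names: it compares the number of ways of breaking $n$ objects into $n/2$ unordered pairs with the product $1.3.5\ldots(n-1)$.

```python
from math import factorial


def pairings_count(n):
    """Number of ways of breaking n objects (n even) into n/2 unordered pairs."""
    half = n // 2
    # n! orderings, divided by 2 for the order inside each pair
    # and by (n/2)! for the order of the pairs themselves.
    return factorial(n) // (2 ** half * factorial(half))


def odd_product(n):
    """The product 1.3.5...(n - 1) for even n."""
    result = 1
    for k in range(1, n, 2):
        result *= k
    return result


for n in range(2, 13, 2):
    print(n, pairings_count(n), odd_product(n))
```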


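Secondly let us consider the value of $q_{1^{n-1}2}$. The mean value of $x_2$ for a given
value of $x_1$ is $r_{12}\sigma_2x_1/\sigma_1$; let

$$x_2 = r_{12}\frac{\sigma_2}{\sigma_1}x_1 + X_2.$$

Then the distribution of $X_2$ for a given value of $x_1$ is itself normal and its $k$th
moment is zero for an odd $k$ and

$$1.3.5\ldots(k-1)\,({}_1\sigma_2)^k$$

for an even $k$, where ${}_1\sigma_2$ is the standard deviation of $x_2$ within the $x_1$ array, so that

$${}_1\sigma_2^2 = (1 - r_{12}^2)\,\sigma_2^2.$$

$$\begin{aligned}
q_{1^{n-1}2} &= \frac{1}{\sigma_1^{n-1}\sigma_2}\,\text{Mean value}\,(x_1^{n-1}x_2)\\
&= \frac{1}{\sigma_1^{n-1}\sigma_2}\,\text{Mean}\left\{x_1^{n-1}\,\text{Mean}\left(r_{12}\frac{\sigma_2}{\sigma_1}x_1 + X_2\right)\right\}\\
&= r_{12}\,\text{Mean}\,(x_1^n)/\sigma_1^n = 1.3.5\ldots(n-1)\,r_{12}
\end{aligned} \qquad (9).$$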


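The method employed in the original proof of equation (1) is not convenient
for generalisation and we will now prove the equation

$$q_{1234} = r_{12}r_{34} + r_{13}r_{24} + r_{14}r_{23}$$

by the method that leads to the general case.

Putting as above

$$x_2 = r_{12}\frac{\sigma_2}{\sigma_1}x_1 + X_2, \qquad x_3 = r_{13}\frac{\sigma_3}{\sigma_1}x_1 + X_3, \qquad x_4 = r_{14}\frac{\sigma_4}{\sigma_1}x_1 + X_4,$$

we have

$$\begin{aligned}
p_{1234} &= \text{Mean of}\,(x_1x_2x_3x_4)\\
&= \text{Mean of}\,\{x_1\,(\text{Mean of } x_2x_3x_4 \text{ for a given value of } x_1)\}\\
&= \text{Mean of}\left[x_1\left\{\text{Mean of}\left(r_{12}\frac{\sigma_2}{\sigma_1}x_1 + X_2\right)\left(r_{13}\frac{\sigma_3}{\sigma_1}x_1 + X_3\right)\left(r_{14}\frac{\sigma_4}{\sigma_1}x_1 + X_4\right)\right\}\right].
\end{aligned}$$

Now for normal distributions (and if the original distribution is normal, so is that
within the $x_1$ array), Mean $X_2 = 0$, Mean $X_2X_3X_4 = 0$, while

$$\begin{aligned}
\text{Mean } X_2X_3 &= ({}_1\sigma_2)({}_1\sigma_3)\,{}_1r_{23}\\
&= \sigma_2\sqrt{1 - r_{12}^2}\;\sigma_3\sqrt{1 - r_{13}^2}\;\frac{r_{23} - r_{12}r_{13}}{\sqrt{1 - r_{12}^2}\,\sqrt{1 - r_{13}^2}}\\
&= (r_{23} - r_{12}r_{13})\,\sigma_2\sigma_3
\end{aligned} \qquad (10).$$

Hence

$$\begin{aligned}
p_{1234} = \text{Mean of}\Big[\,&r_{12}r_{13}r_{14}\,\sigma_2\sigma_3\sigma_4\,\frac{x_1^4}{\sigma_1^3} + r_{12}\sigma_2\,\frac{x_1^2}{\sigma_1}\,(r_{34} - r_{13}r_{14})\,\sigma_3\sigma_4\\
&+ r_{13}\sigma_3\,\frac{x_1^2}{\sigma_1}\,(r_{24} - r_{12}r_{14})\,\sigma_2\sigma_4 + r_{14}\sigma_4\,\frac{x_1^2}{\sigma_1}\,(r_{23} - r_{12}r_{13})\,\sigma_2\sigma_3\Big],
\end{aligned}$$

or dividing by $\sigma_1\sigma_2\sigma_3\sigma_4$,

$$\begin{aligned}
q_{1234} &= r_{12}r_{13}r_{14}\,q_{1^4} + q_{1^2}\{r_{12}(r_{34} - r_{13}r_{14}) + r_{13}(r_{24} - r_{12}r_{14}) + r_{14}(r_{23} - r_{12}r_{13})\}\\
&= r_{12}r_{34} + r_{13}r_{24} + r_{14}r_{23},
\end{aligned}$$

since $q_{1^2} = 1$ and $q_{1^4} = 3$. Thus our formula is established for the case of four
variables.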


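4. We will establish the case for $n$ variables by induction, and it will be
convenient to denote by ${}_1q_{234\ldots n}$ the value of the reduced product-moment coefficient
for the variables $2, 3, 4, \ldots n$ within the $x_1$ array, so that

$${}_1q_{234\ldots n} = \frac{\text{Mean value of}\,(X_2X_3\ldots X_n)}{({}_1\sigma_2)({}_1\sigma_3)\ldots({}_1\sigma_n)},$$

where $X_2, X_3, \ldots X_n$ denote as before the deviations of the variables from their
means within the $x_1$ array. Of course when $n$ is even, ${}_1q_{234\ldots n}$ is zero since $n - 1$ is now odd.

Let $n$ be even and assume that our formula has been proved true for all even
values of $n$ up to $n - 2$ inclusive, then

$$\begin{aligned}
p_{123\ldots n} &= \text{Mean}\,(x_1x_2x_3\ldots x_n)\\
&= \text{Mean}\left\{x_1\left(r_{12}\frac{\sigma_2}{\sigma_1}x_1 + X_2\right)\left(r_{13}\frac{\sigma_3}{\sigma_1}x_1 + X_3\right)\ldots\left(r_{1n}\frac{\sigma_n}{\sigma_1}x_1 + X_n\right)\right\}\\
&= r_{12}r_{13}\ldots r_{1n}\,\sigma_2\sigma_3\ldots\sigma_n\,\text{Mean}\,(x_1^n)/\sigma_1^{n-1}\\
&\quad + S\,\{(r_{1a}r_{1b}r_{1c}\ldots)(\sigma_a\sigma_b\sigma_c\ldots)\,\text{Mean}\,(X_\alpha X_\beta)\}\,\text{Mean}\,(x_1^{n-2})/\sigma_1^{n-3}\\
&\quad + S\,\{(r_{1a}r_{1b}r_{1c}\ldots)(\sigma_a\sigma_b\sigma_c\ldots)\,\text{Mean}\,(X_\alpha X_\beta X_\gamma X_\delta)\}\,\text{Mean}\,(x_1^{n-4})/\sigma_1^{n-5}\\
&\quad + \ldots\\
&\quad + S\,\{r_{1a}\sigma_a\,\text{Mean}\,(X_\alpha X_\beta\ldots X_\kappa)\}\,\text{Mean}\,(x_1^2)/\sigma_1
\end{aligned} \qquad (11),$$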
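the summations in each line extending to all possible permutations of the suffixes
$2, 3, 4, \ldots n$. The last line for example being

$$\frac{\text{Mean}\,(x_1^2)}{\sigma_1}\,\{r_{12}\sigma_2\,\text{Mean}\,(X_3X_4\ldots X_n) + r_{13}\sigma_3\,\text{Mean}\,(X_2X_4X_5\ldots X_n) + \ldots + r_{1n}\sigma_n\,\text{Mean}\,(X_2X_3\ldots X_{n-1})\}.$$

Now we have seen that Mean $(X_2X_3) = (r_{23} - r_{12}r_{13})\,\sigma_2\sigma_3$. Similarly,

$$\begin{aligned}
\text{Mean}\,(X_2X_3X_4X_5) &= ({}_1\sigma_2)({}_1\sigma_3)({}_1\sigma_4)({}_1\sigma_5)\,({}_1q_{2345})\\
&= ({}_1\sigma_2)({}_1\sigma_3)({}_1\sigma_4)({}_1\sigma_5)\,[({}_1r_{23})({}_1r_{45}) + ({}_1r_{35})({}_1r_{24}) + ({}_1r_{25})({}_1r_{34})]\\
&= \sigma_2\sigma_3\sigma_4\sigma_5\,[(r_{23} - r_{12}r_{13})(r_{45} - r_{14}r_{15}) + (r_{35} - r_{13}r_{15})(r_{24} - r_{12}r_{14})\\
&\qquad\qquad\quad + (r_{25} - r_{12}r_{15})(r_{34} - r_{13}r_{14})];
\end{aligned}$$

and our assumption of the truth of equation (6) up to $(n - 2)$ variables will enable
us to write down the mean value of every product of $X$’s occurring in (11).
Dividing by $\sigma_1\sigma_2\ldots\sigma_n$ we have, remembering that Mean $x_1^n/\sigma_1^n$ is $1.3.5\ldots(n-1)$,

$$\begin{aligned}
q_{123\ldots n} &= (r_{12}r_{13}\ldots r_{1n})\;1.3.5\ldots(n-1)\\
&\quad + S\,\{r_{1a}r_{1b}r_{1c}\ldots(r_{\alpha\beta} - r_{1\alpha}r_{1\beta})\}\;1.3.5\ldots(n-3)\\
&\quad + S\,\{r_{1a}r_{1b}\ldots S'[(r_{\alpha\beta} - r_{1\alpha}r_{1\beta})(r_{\gamma\delta} - r_{1\gamma}r_{1\delta})]\}\;1.3.5\ldots(n-5)\\
&\quad + \ldots\\
&\quad + S\,\{r_{1a}\,S'[(r_{\alpha\beta} - r_{1\alpha}r_{1\beta})(r_{\gamma\delta} - r_{1\gamma}r_{1\delta})(r_{\epsilon\eta} - r_{1\epsilon}r_{1\eta})\ldots]\}
\end{aligned} \qquad (12),$$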


where $S'$ refers to permutations of $\alpha\beta\gamma\ldots$ only, and $S$ to permutations of all the
suffixes $a, b, c, \ldots \alpha, \beta, \gamma \ldots$, i.e. all the suffixes $2, 3, 4, \ldots n$.

It is clear that when the right-hand member of (12) is completely expanded
no terms can survive which contain as a factor more than one correlation coefficient
with suffix unity. This is easily verified in simple cases, and if in the general case
a term $r_{1a}r_{1b}r_{cd}\ldots$ survived, this term would reduce to $r_{12}^{\lambda}$ when we identified
the characters $2, 3, \ldots n$, which contradicts the value $1.3.5\ldots(n-1)\,r_{12}$ we
have already found for it (equation (9)).

The value of the right-hand member is therefore easily found by neglecting all
terms containing more than one such factor.

Hence on the assumption that (6) is true for all even values up to $(n - 2)$ we find

$$q_{123\ldots n} = S\,\{r_{1a}\,S'(r_{\alpha\beta}r_{\gamma\delta}\ldots)\};$$

but this is exactly the formula we wished to establish, for it is obvious that
$S\,(r_{ab}r_{cd}\ldots r_{hk})$ where $abc\ldots k$ is a permutation of $12\ldots n$ is equivalent to

$$S\,\{(r_{1a})\,S'(r_{\alpha\beta}r_{\gamma\delta}\ldots)\}$$

where $a, \alpha, \beta, \gamma \ldots$ is a permutation of $2, 3, 4, \ldots n$. Thus our formula, which has been
proved true for 4 variables, is seen by induction to be true in general.


5. Formula (6) can be exhibited as a multiple definite integral. Let $\Delta$ denote
the determinant whose $h$th row consists of the elements

$$(r_{h1},\ r_{h2},\ \ldots\ r_{h,h-1},\ 1,\ r_{h,h+1},\ \ldots\ r_{hn})$$

and let $\Delta_{hk}$ denote the cofactor of the element in the $h$th row and $k$th column.
Let

$$\chi^2 = \frac{1}{\Delta}\left(\Sigma\,\Delta_{pp}\frac{x_p^2}{\sigma_p^2} + 2\,\Sigma\,\Delta_{pq}\frac{x_px_q}{\sigma_p\sigma_q}\right)$$

and

$$z = \frac{1}{(2\pi)^{n/2}\,\sigma_1\sigma_2\ldots\sigma_n\sqrt{\Delta}}\;e^{-\frac{1}{2}\chi^2},$$

then

$$\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}\ldots\int_{-\infty}^{+\infty} x_1x_2\ldots x_n\,z\,dx_1dx_2\ldots dx_n = \sigma_1\sigma_2\ldots\sigma_n\,\Sigma\,[r_{ab}r_{cd}\ldots r_{uv}] \qquad (13),$$

where $a, b, c, d, \ldots u, v$ are the suffixes $1, 2, 3, \ldots n$ in any possible order.

It is clear that (13) will enable us to write down the value of the multiple
integral $\int\!\ldots\!\int P e^{-Q}\,dx_1\ldots dx_n$, where $P$ is any polynomial in $x_1, x_2, \ldots x_n$ and $Q$ a positive
quadratic form.


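In fact, let $\Sigma\,a_{pp}x_p^2 + 2\,\Sigma\,a_{pq}x_px_q$, $(a_{pq} = a_{qp})$, be a positive, definite, quadratic
form, then

$$W = \int_{-\infty}^{+\infty}\!\ldots\!\int_{-\infty}^{+\infty} x_1^{\alpha_1}x_2^{\alpha_2}\ldots x_n^{\alpha_n}\,\exp -\tfrac{1}{2}\left(\Sigma\,a_{pp}x_p^2 + 2\,\Sigma\,a_{pq}x_px_q\right)dx_1dx_2\ldots dx_n,$$

while

$$\frac{1}{(2\pi)^{n/2}\,\sigma_1\sigma_2\ldots\sigma_n\sqrt{\Delta}}\int_{-\infty}^{+\infty}\!\ldots\!\int_{-\infty}^{+\infty} x_1^{\alpha_1}x_2^{\alpha_2}\ldots x_n^{\alpha_n}\,\exp -\frac{1}{2\Delta}\left(\Sigma\,\Delta_{pp}\frac{x_p^2}{\sigma_p^2} + 2\,\Sigma\,\Delta_{pq}\frac{x_px_q}{\sigma_p\sigma_q}\right)dx_1\ldots dx_n$$
$$= \sigma_1^{\alpha_1}\sigma_2^{\alpha_2}\ldots\sigma_n^{\alpha_n}\,\Sigma\,[r_{ab}r_{cd}\ldots r_{hk}],$$

where $abc\ldots hk$ is any permutation of the $\alpha_1 + \alpha_2 + \ldots + \alpha_n$
suffixes of which $\alpha_1$ are equal to 1, $\alpha_2$ are equal to 2 and so on.

Let $D$ denote the determinant of the quadratic form and $D_{pq}$ the cofactor of
$a_{pq}$; the two multiple integrals will be identical if

$$\frac{\Delta_{pp}}{\Delta\,\sigma_p^2} = a_{pp}, \qquad \frac{\Delta_{pq}}{\Delta\,\sigma_p\sigma_q} = a_{pq}.$$

Hence $r_{pq}^2 = D_{pq}^2/D_{pp}D_{qq}$ and $\sigma_p^2 = D_{pp}/D$, while $\Delta = D^{n-1}/D_{11}D_{22}\ldots D_{nn}$,
so that

$$W = \frac{(2\pi)^{n/2}}{D^{(m+1)/2}}\,\Sigma\,[D_{ab}D_{cd}\ldots D_{hk}] \qquad (13),$$

where $a, b, \ldots h, k$ is a permutation as above, and $m = \alpha_1 + \alpha_2 + \ldots + \alpha_n$ is even.
$W = 0$ when $m$ is odd.

As an illustration of this result:

$$\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} (Mx^2y^2z^2 + Nx^2yz)\,\exp -\tfrac{1}{2}(ax^2 + by^2 + cz^2 + 2fyz + 2gzx + 2hxy)\,dx\,dy\,dz$$
$$= \frac{(2\pi)^{3/2}}{\Delta^{7/2}}\,M\,(ABC + 2AF^2 + 2BG^2 + 2CH^2 + 8FGH) + \frac{(2\pi)^{3/2}}{\Delta^{5/2}}\,N\,(2GH + AF),$$

where $A, B, C, F, G, H$ are the cofactors of $a, b, c, f, g, h$ in

$$\Delta = \begin{vmatrix} a & h & g\\ h & b & f\\ g & f & c \end{vmatrix}.$$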
A cognate result is discussed by Mr Arthur Black in the Transactions of the
Cambridge Philosophical Society*. Black’s integral is $\int\!\ldots\!\int V e^{-U}\,dx_1\ldots dx_n$, where $V$
and $U$ are any quadratic functions, the only restriction on $U$ being that it should
be essentially positive. Other particular cases have been dealt with in the paper
previously quoted, and for the case of two variables several results are given by
Mr H. E. Soper†.

For reference we add a table of values of the reduced product-moment coeffi- 
cients that occur frequently in formulae for probable errors and similar work. 


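$q_{1^4} = 3$.
$q_{1^32} = 3r_{12}$.
$q_{1^22^2} = 1 + 2r_{12}^2$.
$q_{1^223} = r_{23} + 2r_{12}r_{13}$.
$q_{1^6} = 15$.
$q_{1^52} = 15r_{12}$.
$q_{1^42^2} = 3 + 12r_{12}^2$.
$q_{1^32^3} = 9r_{12} + 6r_{12}^3$.
$q_{1^32^23} = 3\,(r_{13} + 2r_{23}r_{12} + 2r_{13}r_{12}^2)$.
$q_{1^423} = 3\,(r_{23} + 4r_{12}r_{13})$.
$q_{1^22^23^2} = 1 + 2r_{23}^2 + 2r_{31}^2 + 2r_{12}^2 + 8r_{12}r_{23}r_{31}$.
$q_{1^8} = 105$, $q_{1^72} = 105r_{12}$, $q_{1^62^2} = 15\,(6r_{12}^2 + 1)$.
$q_{1^52^3} = 15\,(4r_{12}^3 + 3r_{12})$.
$q_{1^42^4} = 3\,(8r_{12}^4 + 24r_{12}^2 + 3)$.
$q_{1^\lambda 23} = 1.3\ldots(\lambda - 1)\,(r_{23} + \lambda r_{12}r_{13})$. $\lambda$ even.
$q_{1^\lambda 2^23} = 1.3.5\ldots\lambda\,[(\lambda - 1)\,r_{12}^2r_{13} + r_{13} + 2r_{12}r_{23}]$. $\lambda$ odd.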

For the case of two variables we add the following formula which is easily proved 
by the methods employed in this paper. 


$$q_{1^u2^v} = \phi(u + v)\,r^v + \binom{v}{2}\phi(2)\,\phi(u + v - 2)\,r^{v-2}(1 - r^2) + \binom{v}{4}\phi(4)\,\phi(u + v - 4)\,r^{v-4}(1 - r^2)^2 + \ldots,$$

the series terminating. Here
$$\phi(2m) = 1.3.5\ldots(2m - 1)$$
is) Oe) Ol) 


and

$$\binom{n}{m} = \frac{n!}{m!\,(n - m)!}.$$
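
The two-variable series can be checked against the tabulated values. The sketch below is an illustration only: it assumes the series has the form $\sum_k \binom{v}{2k}\phi(2k)\,\phi(u+v-2k)\,r^{v-2k}(1-r^2)^k$, that $\phi(0) = 1$, and that $\phi$ vanishes for odd arguments; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def phi(m):
    """phi(2m) = 1.3.5...(2m - 1); taken as 1 at 0 and 0 for odd arguments."""
    if m % 2:
        return 0
    result = 1
    for k in range(1, m, 2):
        result *= k
    return result


def q_two_variables(u, v, r):
    """Reduced moment q_{1^u 2^v} from the terminating series in r and 1 - r^2."""
    total = 0
    for k in range(0, v // 2 + 1):
        total += (comb(v, 2 * k) * phi(2 * k) * phi(u + v - 2 * k)
                  * r ** (v - 2 * k) * (1 - r * r) ** k)
    return total


r = Fraction(1, 3)  # illustrative correlation (hypothetical value)
print(q_two_variables(4, 4, r), 3 * (8 * r ** 4 + 24 * r ** 2 + 3))
print(q_two_variables(3, 3, r), 9 * r + 6 * r ** 3)
```

Under these assumptions the series reproduces the entries of the table for $q_{1^42^4}$, $q_{1^32^3}$ and the lower orders.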


* Vol. XVI, 1898, pp. 219-227.
† Biometrika, vol. IX, p. 101.
‡ This is virtually the formula (xxxii) employed by H. E. Soper, l.c., corrected for some misprints.


ON THE MATHEMATICAL EXPECTATION OF THE 
MOMENTS OF FREQUENCY DISTRIBUTIONS. 


By PROFESSOR AL. A. TCHOUPROFF of Petrograd. 


INTRODUCTION 
I 


(1) One of my pupils, O. Anderson, in a brief exposition* of his researches on
the Variate Difference Correlation Method in Biometrika (1914), draws attention
to the superiority of the method of mathematical expectation over the methods
usually employed by English statisticians. The small popularity enjoyed by the
method of mathematical expectation in England is not of course accidental.
English scientific tradition rejects the concept of “mathematical probability.”


From the time of R. L. Ellis and of the first edition of John Stuart Mill’s 
System of Logic, the logician’s basis of probability has, in England, been the notion 
of empirical frequency. English mathematicians have followed the lead of the 
writers on logic in their preference for the idea of statistical frequency, and the 
method of mathematical expectation has naturally shared the fate of the concept 
of mathematical probability on which it rests. 


Notwithstanding its deep-rooted historical basis, English statisticians should 
break with this tradition. The substitution of statistical frequency for mathe- 
matical probability does not obviate the logical difficulties in laying the foundations 
for a statistical study of Causation, but merely shifts them elsewhere. The gain 
from the point of view of philosophical representation is sufficiently doubtful, 
while from the purely mathematical point of view the rejection of the ideas of 
mathematical probability and mathematical expectation is accompanied by very 
substantial disadvantages. Verbal formulation becomes very complicated, leading 
to loss of economy of attention: it is continually necessary to speak of “the
statistical frequencies which would become established if the number of occurrences
were infinitely great.” The absence of a sharp distinction in terminology between 
statistical frequency in the exact meaning of the term and those quasi-empirical 


* Anderson’s research was carried out under my supervision in the statistical seminary attached to 
the Economics Department of the Petrograd Polytechnic Institute; the results he obtained were to 
have been published in extenso in the Proceedings (Students’ Section) of the Economics Department, 
but the War drew Mr Anderson away from his scientific pursuits to other work of a more practical 
character and the complete publication of his researches had to be postponed. 
“frequencies which would become established in an indefinitely great number of
occurrences” often fails to make the very statement of the problem clear to the 
reader, and occasionally it would appear, to the author: when reading published 
papers one not infrequently feels that the author does not give himself a full 
account as to what it is he is really calculating. 


Little harm follows so long as the problems dealt with are comparatively 
simple. But at the present time there are problems waiting for solution which 
are so complex that the slightest obscurity in their formulation threatens to 
become a source of error in the final deductions. 


When we start with “mathematical probability ” and “mathematical expecta- 
tion” as a foundation we substantially simplify the mathematical exposition. The 
logical analysis of the conclusions to which we are led is not injuriously affected
by the substitution of one set of terms for the other during the calculations. 


(2) If the variable magnitude $X$ can take the values $\xi_1, \xi_2, \ldots \xi_k$ with
probabilities $p_1, p_2, \ldots p_k$, I call the system of values $\xi_1, \xi_2, \ldots \xi_k$ and the values
$p_1, p_2, \ldots p_k$ associated with them “the law of distribution of the values of the
variable $X$.” The law of distribution of values lies at the base of empirical
“frequency curves,” just as the mathematical probability of an event lies at the
base of its statistically established frequency.

Denoting by the symbol $EX$ the mathematical expectation of the variable
magnitude $X$, we have as is well known:

$$EX = \sum_{j=1}^{k} p_j\,\xi_j,$$

where $\sum_{j=1}^{k} p_j = 1$.

I call the variable magnitudes $X, Y, Z, \ldots$ mutually independent, if the law of
distribution of each of them remains one and the same whatever values are given
to the others. In this case $EX$ remains constant for all possible values of the
variables $Y, Z, \ldots$.

If the law of distribution of $X$ does not remain the same for different values of
$Y, Z, \ldots$, the variables $X, Y, Z, \ldots$ are mutually dependent. The mathematical
expectation of the variable $X$ on the supposition that $Y$ has received the value $\eta_i$,
$Z$ the value $\zeta_j$, etc., I denote by $E^{(\eta_i, \zeta_j, \ldots)}X$ and call it the “conditional
mathematical expectation of $X$ on the supposition that the remaining variables have
received definite values.”

It follows from the definitions that

$$E(X + Y + Z + \ldots) = EX + EY + EZ + \ldots$$

both in the case when the variables are mutually independent, and when they are
correlated, and that

$$E(XYZ\ldots) = (EX)(EY)(EZ)\ldots,$$

if the variables are mutually independent.
In the case in which $X$ and $Y$ are correlated we have:

$$EXY = \sum_{j=1}^{k} p_j\,\xi_j\,E^{(\xi_j)}Y, \qquad EY = \sum_{j=1}^{k} p_j\,E^{(\xi_j)}Y.$$
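
The relations between the expectation of a product and the conditional expectations can be illustrated on a small law of distribution. The sketch below is not taken from the paper: the joint law is a hypothetical example, and the helper names are illustrative.

```python
from fractions import Fraction

# Hypothetical joint law of (X, Y): probability of each pair of values.
joint = {
    (0, 1): Fraction(1, 4),
    (0, 3): Fraction(1, 4),
    (2, 1): Fraction(1, 8),
    (2, 3): Fraction(3, 8),
}


def expectation(f):
    """Mathematical expectation of f(x, y) under the joint law."""
    return sum(p * f(x, y) for (x, y), p in joint.items())


def marginal_x():
    """Law of distribution of X: value -> probability."""
    law = {}
    for (x, _), p in joint.items():
        law[x] = law.get(x, 0) + p
    return law


def conditional_expectation_y(x_value):
    """Expectation of Y on the supposition that X has received x_value."""
    p_x = marginal_x()[x_value]
    return sum(p * y for (x, y), p in joint.items() if x == x_value) / p_x


law_x = marginal_x()
exy_direct = expectation(lambda x, y: x * y)
exy_conditional = sum(p * x * conditional_expectation_y(x) for x, p in law_x.items())
ey_direct = expectation(lambda x, y: y)
ey_conditional = sum(p * conditional_expectation_y(x) for x, p in law_x.items())
print(exy_direct, exy_conditional, ey_direct, ey_conditional)
```

In this example $X$ and $Y$ are dependent, so $EXY$ differs from $(EX)(EY)$, while the additive rule for $E(X + Y)$ still holds.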


II


(1) In investigations in the theory of probability we frequently have to deal 
with expressions of the type: N(N—1)(N-2)...(N—k+1). Following the 


example of Capelli*, I use the slightly modified notation:


Pa Agee py a 
N(N-4Y)\(N 42)... MN +h-1) = No) ee 
k 
Let Ne= > y,NC4 
i=1 
hep fete Shenae octe carmen (2). 
NE = & (-1) 


J 
The coefficients I have denoted by $\alpha$, $\beta$ are beginning to play an important
part in the theory of finite differences† and are of the first importance in all
investigations into the law of large numbers. Their properties were first studied
systematically in Chapter III of Cramp’s well-known work, Analyse des réfractions
astronomiques et terrestres; some of their properties were discovered by
investigators studying Bernoulli’s numbers; recently they have received the attention of
the Italian mathematical school associated with Césaro and Capelli. The methods
I employ to solve fundamental problems of mathematical statistics are directly
founded on certain properties of the $\alpha$, $\beta$ coefficients. In view of the fact that I
shall later on frequently make use of these methods, I state here, without proof,
those properties of the coefficients $\alpha$ and $\beta$ that I shall have to quote in the
present paper‡.
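
The factorial products of (1) and their connexion with ordinary powers can be sketched numerically. This is an illustration under stated assumptions, not a transcription of equations (2): the coefficient arrays below are obtained simply by multiplying out, they are assumed to play the part of the $\alpha$ and $\beta$ coefficients, and the indexing and signs used in the paper are not reproduced. All function names are hypothetical.

```python
def rising_factorial(n, k):
    """N(N + 1)(N + 2)...(N + k - 1)."""
    result = 1
    for j in range(k):
        result *= n + j
    return result


def falling_factorial(n, k):
    """N(N - 1)(N - 2)...(N - k + 1)."""
    result = 1
    for j in range(k):
        result *= n - j
    return result


def rising_coefficients(k):
    """Coefficients c[i] with N(N+1)...(N+k-1) = sum_i c[i] * N**i."""
    coeffs = [1]
    for j in range(k):
        new = [0] * (len(coeffs) + 1)
        for i, c in enumerate(coeffs):
            new[i] += c * j      # contribution of the constant j in (N + j)
            new[i + 1] += c      # contribution of N in (N + j)
        coeffs = new
    return coeffs


def power_in_falling(k):
    """Coefficients s[j] with N**k = sum_j s[j] * N(N-1)...(N-j+1)."""
    s = [1]
    for _ in range(k):
        new = [0] * (len(s) + 1)
        for j, c in enumerate(s):
            new[j] += c * j      # N * (falling, j) = (falling, j+1) + j * (falling, j)
            new[j + 1] += c
        s = new
    return s


print(rising_coefficients(3))   # coefficients of N(N+1)(N+2)
print(power_in_falling(3))      # N**3 in falling factorial products
```

The two arrays are, up to the conventions assumed here, the numbers usually called Stirling numbers of the first and second kind.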


(2) We have: 


a1. = 1 | 

Ce tot Gl Cen ln nossa nnhhccnodammnonnisadoaautroonatnpaeBandden jodgb0 5c 200002758 (3), 

On i= Ope a + Okara | 

Fe Os eas ChOSN Paes DiGi & Oe ay 
: Sere a Wer Aa 


* Vide Capelli: “Instituzione di analisi” and the same author’s “L’analisi algebrica e
l’interpretazione fattoriale delle potenza.” (Giornale di matematica di Battaglini, Vol. XXXI.)

† Cf. A. A. Markoff, Calculus of Finite Differences (2nd edition).

‡ Readers interested in the proofs of these properties, many of them established for the first time by
myself, will find a complete analysis in my paper in the Proceedings of the Petrograd Polytechnic
Institute.
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Putting es ee She ert ates 


i.e. denoting by C;’ the number of combinations of /: clements / at a time, we may 
express 4%, ;—, 1n the form : 


n—-1 
Ob, k—n = = A Of eater irr ever vayerelafeisceseve\eis/a\9.si01s\a\eio\s (elaie:s (5), 


7=0 


where the coefficients A,,; are independent of /: and are defined by the relations : 


An o=(2n—1) Ana 
App (i=) Agere (21g aL) | reat acelin (G6). 
Annet Ln aa 
Hence 
alo... (2% — 1) \ 
Hee bd. >...(2n—1)4[n—1] | 
A= 1.3.5...(2n—8) {k[n— 9 +2[n —1]-3} | 
Ans=1.3.5...(2n—8) {A [n— 2-41 4+ 4g [n — 2] + i [n -— 2IP} " 
ea edD)...(20— 9) (aeq(n — 2] 4+ [nm — 2) 9) + le [nn — rote | oe 
+o a2) 
Ans=1.3.5...(2n—5) {rpg [n — 8 + ghey [n — 8]-9 +44, [mn - 8] 
+745 [n—3] 14+ hy [vn - 3] 
The coefficient A,,,-; can easily be expressed in an independent form. 
Putting 1A Sy Sans Os 
we shall have : 
ee CR OE Th, ‘aie SS eee ia (9), 


where the summation extends to all possible positive integer values of 7, ., ... 7, 
satistying the relation: 4,+7%,+...+%,=1, and to all integer values of h,, hy, ... he, 
satisfying the conditions 

DM i eee Ny 


hy thet +... thpy ante. 


Introducing the notation 


7 
inf (@) = VW fw —j)= = Ely CF (@=0) hie eee: (9), 
and noting that 
yh h— 
He =C a 
we find from (5): 


(k) on k—n 


n-1 
? og a A: 
Vv! Al On 
j=9 He k-j 
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When h >0 and 2n—h <k we have: 


2n+ De find \ 
v (hk) en k-n 0 
h 
2Qn wre ie. wo yh —t 
Via a, k—v 7 ner (euon tn) A n,t ~k-2n+h 
2n ae —_ ) « oe EEDA DIC OH OC OOORS 6 10). 
Vin hop = So era 1) ( 
k _— 
Vin ay k—72 Ae eh, 
k+h _ 
Vay Cs 0 ) 
(3) Of the properties of the $\beta$ coefficients it is essential to note the following:
Bx,o =1 
Beg c= Beg =!) Bicone eee eee (HD) 
Bi = Water kes) 
Jah ; 
Bj = >> By Ae ~ bcvscauless ssiiqdeeiiieelaa see eee EER (12), 


where the coefficients B; ; are independent of / and are determined by the relations 
Iii SURE oo (CH) 1) 
Be; = (2) —1 =1) (Bas Baal eee eee (13). 
ipods BrP) son i) 
Hence we have: 
Biol ope e) =) 
Bj=1.3-5..2—-)sy—U 
 QG- i. 15 lle ae 
eC2j 2) eer) oe en tj 25 ss ees 


= 


Or 


& 

©. 
Il 
CSF 209 
Or 


: \* 2(14): 
By, =1:3.5... (3 —5) ete GV — 209 4 4 — 21 + ely — 2 
+4 21 
Byg= 1.3.5... 2) —5) (alls Li BI + 3 + EG 38 | 
+ 197-3) + 8-3) 
From (12) we find, when h > 0, 
Tex) y2j —i—f \ 
Vin By, oe = 2 Bi, ene fe 
jth ae 
Vo Fr,5=9 
Vee h S h-i 
Vi By, 3= Res rely (tt I oB8oDaNDOI O00 0e20 (15). 


tia = B,,=1.3.54.2)-2) 
va Bij = Bi ask 

wR ie 

vi ita =0 ; 
(4) Further, it is important to note certain relations connecting the
coefficients $\alpha$ and $\beta$:
n—-Ie 
a= Oni ORIG O's menos tase « Da Lay thee: (16), 


O= 


k+m-1 


Ope Rx, THM) isie's\sisielove\sis @lsla\eseiels.« (17), 


On, n—k Brisk m= 


iM 


t= 


where the coefficients R,,,,; are independent of m and are determined by the 
relations : 


Bom, i a Die \ 
Ru, o,: — dlyoe | 
Ry, mo = (2k + 2m — 1) [Ry-a,m,o + Re, tll | Bee (alice): 


Ry, m8 > (2h ap i = dl t) pe nes aE Ti meal 
+ (k + 2m — t) Pieaa micas + Rx, tae a Ty m—1,i—1 


From (18) we find 


apie ig — Wee Oi. (2Si—> L) CF | 
peat ies a ain a er a e(L9)) 
By sta = a -1.8.5... 28-1) {OF + C4} | 
and, in general, Tipp poise oh HH OL a 1 Sean an Se eRe (20), 
j=0 


where the coefficients 7,;,; are independent of k and are determined by the 


relations: 


Ts,h,o = Are h 


h 
= 754,59 = Ber AR 
7=0 
ge WS Veg (St 9 — ht) Te,h-a,5+ (SJ) Te,taja 
Putting 
U 1-j l—j-1 ) 
a : Mer So . ) 
t,. ite ae {Oo 7-3 pte OS ei epm l 8, 2h, 7) 
Pe oe ha ee. (22), 
, Bs < l-j on 
Dies TU anaes Ce Ur 2+, 7 
j=0 
we find further: 
h 
es s-k-l 
Ry, s—k,2h Sah t. ht Cer 
| cee (23), 
h 


ae OS s-k-l ys—k—-I-1 
k, s—k,2h+1 nae or hl [C.on-+2 WF Cay Sa 


Biometrika xu 10 
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where 

tso,0%1.3 

om ee eT 

fa i=1.8.5...(28— 8) (élis= 11 is— l= 

te,2,0= 1.3.5 --.(28—5) {aby [s—2]-9+2,[8—2]- ne J [s—-2]- 45 [s—2]? 

ts 1=1.3.5...(28—5) {o4y [s—2] 1+ sh [s— QI) + 42 [s— 2] 4-4 [s—2]} 

ts00=1.3.5...(2s—5) [,85[s—2]-94+4 [s—2]-14 47 [s—2]-14 4[s - 2-4] 

UPR vi demes eres Seis nee 49 ol WYP ie 

t's, o= 1.81 Dee. (2s — 34a, fs — Ie es = ee Eis soil | 

f’gy1= 1.3.5... (28 —8) {ot [s — 2-41-44 [s — 2] 4 2 fs — 2] | 
| 


A 

Or 
= 

bo 

H 

| 

Et 

— 


\ 
| 
,e 
J 


fien9=1.3.5...(2s—5) (een [is = Slat te [ig = 8 la ig allel 
totes Bj te is— olla eee 45))) 

fo1= 1.8.5 ...(28—5) Geter (8 = BIg [SOI ele! 
+48 [s— 3-94 & [s- 39 | 

(25 — 5) {yds [— 3] + fr[s— 3] + 8h [s—3I9| 
+ 48[s—3]-44[s— 3] 


pope ee 


Or 


From (17) we find, when h > 0, 


vi B ns ya Me+2m—i—-j \ 
(n) a, n—-k “n-k,m 7 Aa k,m,i ~ n—j | 
QBk+2m+h | 
(2) n,n—-k ery as = 0 
7 2K-F 2 —h a B = ae h-i 
-kln-k,m = any t ~ w-2k—Qm 
(n) n,n-k'!“n-k, im Solar SEO ED) k,m,ti ~ w-2k-—2m+h iv ...(26), 


| 


2k+2m ¢ 
7 = 9 Im, — 
vi hs ea Boe m a hk, m,0 7 1 3. 5. (2 k+ ZIM. 1)C Be. 


n) 


| 
= 1 a 2 
(n) Bon n-k [chee m ms R, m, n-2k—-—2m 
nth | 
Me Oo =e Bran m saa 


(5) In Chapter IV we shall have to deal with more complicated expressions 

of type: 
: 

Views oa. Ge ry hy Orgy raahig «2 Orgs ry he rg bret techy Wao he ee 

In my previously quoted paper they are not considered as I met them for the 
first time in connection with the problems considered in the fourth Chapter of the 
present paper. My discussion of these expressions has not so far led me to results 
which may be considered final, and I shall merely indicate the method by which 
their fundamental properties may be established. 

Putting $r_1 + r_2 + \dots + r_k = R$, $h_1 + h_2 + \dots + h_k = H$, let us replace $\alpha$ and $\beta$ by
their values from (5) and (12). Noting that, as is well known,

[et yJoms S Cyi ali yeni, 
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ry Pae—W [r, — hy + 2 — hg to FE H My JOV ON ¢ 


Se Fe Uh gly ES y : 


S - ay ay ep 
=, (2h, — 1,)! (Qh, — 4)! ... (2he— be)! g Gi! Go! -- Gra! (2f— 2H —j—g)! 


vr; [-(@2h,-1,)] i [-a,] ie} ([-—@hy—1,)) ra [- 92] 
Re AO SESE [72 — he] 4, 


we find: ’ 
Tee Bae Wes Ce Crier het Or rene Raa fee | 
3 i ms Cee i Aye pines eon Byun, j - | 
FEO =O pa 0 (2h, — t,)!(2h.—1,)!... (2hy—h) 1 (QF -— 2H—J)! | 
SV 3: Me py @hy I] ply b)) | 

f 

| 


ie —(2hp—y—lp— (9h . — (2hy—lk) —(2f-2H-j- 
k-1 1 [—@he-1—Ue-1)] ([-(9r-1)] Vv" rl (2hx JU aera (2f-2H—-j—g)) 


(rp-1) | k-1 rea o. hy] (1%) ‘ke | 


hy-1 hy-1 hy-1 
SoS 


a 
where S denotes Dee ceo We 
1 TP=00,=0))  We=0 


; 2f—-2H—j 2f-2H—-j—g, 2f-2H-j- Gy = 2-0 Thm 
S denotes > > ee > 
g q=0 92.=9 9k-\=9 


and J=N"+t+G. +--+ Gea- 

If we note that Mies rea [r, — h,]i-1 = 0, when 7, > 2h, -—1,+ 91, ete., we 
see without difficulty that, when $2f < R$, the sum we are discussing is equal to
zero. If $2f = R$, then the only non-vanishing term in the sum is the one corresponding
to $l_1 = l_2 = \dots = l_k = j = 0$ and $g_1 = r_1 - 2h_1$, $g_2 = r_2 - 2h_2$, ..., $g_{k-1} = r_{k-1} - 2h_{k-1}$,
$2f - 2H - g = r_k - 2h_k$, and the sum reduces to

Cree Grae dar C,,2h% A}, ,0 AG uae Aliin.0 Bie NAndoUcHobonAdE (28). 


If $2f = R + 1$, there are three types of non-vanishing terms: (1) terms for
which $l_1 = l_2 = \dots = l_k = j = 0$, and for which, of the quantities $g_1, g_2, \dots, g_{k-1}$,
$2f - 2H - g$, one, e.g. $g_i$, is equal to $r_i - 2h_i + 1$, and each of the rest is equal to
$r - 2h$; (2) terms for which $l_1 = l_2 = \dots = l_k = 0$, $j = 1$ and all the quantities
$g_i = r_i - 2h_i$; (3) terms for which $j = 0$, one of the quantities $l$, e.g. $l_i = 1$, and
the other quantities $l$ vanish, $g_i = r_i - 2h_i + 1$, and the remaining quantities
$g$ are each equal to $r - 2h$.

Noting that Vr Xl X — hi-# 
becomes r!h (7 — 2h+1) when k=r—2h+41 and X =r, we can without difficulty 
reduce the sum we are considering, for the case 2f= R +1 to: 


21, 12h y2hK / ) : 
Jal 6 (Op {Cre nome Ajo Ap,,o ++» Anzo Bry ; \ 


= ths 0 
2h, py2h, 2hy, i 
MC ae On Any An Ang Bey oo 
=e ) 
2h,-1 py2h, Rh , 
+ On OF nie Ce Anan ges Aap Bru, ‘i (29). 
=e , 


-H, 0 
12h, cy2h, Wry y2hk-1 
ao Ce neice Ci, AtyoAnso ++» Atg-o An Beir 
2 


2h, y2h.-1 y2hh ' 
sae aU Aj, ,o Ani.» Anjo Bras stewart | 
2 


rk-1 "ke H,0 ) 


Consider a variable magnitude $X$, admitting the values $\xi_1, \xi_2, \dots, \xi_k$ with
probabilities $p_1, p_2, \dots, p_k$. Let us make $N$ experiments, and suppose that the law
of distribution of the values of the variable remains unaltered, and that the
separate experiments are independent of one another. Denoting by $X_i$ the value
taken by the variable in the $i$th experiment and by $n_i$ the number of times, out of
the $N$ experiments, that the variable $X$ takes the value $\xi_i$, let

\[
\begin{aligned}
p'_i &= \frac{n_i}{N}, \\
m_r &= E X^r = \sum_{i=1}^{k} p_i \xi_i^r, \\
\mu_r &= E (X - m_1)^r = \sum_{i=1}^{k} p_i (\xi_i - m_1)^r, \\
X_{(N)} &= \frac{1}{N} \sum_{j=1}^{N} X_j = \sum_{i=1}^{k} p'_i \xi_i, \\
m_{r,(N)} &= E X_{(N)}^r, \\
\mu_{r,(N)} &= E [X_{(N)} - m_1]^r.
\end{aligned}
\qquad (1)
\]

We have, whatever be the law of distribution of the variable $X$:

\[ \sum_{i=1}^{k} p_i = \sum_{i=1}^{k} p'_i = 1, \qquad \sum_{j=1}^{k} n_j = N, \]

\[ m_{1,(N)} = m_1, \qquad \mu_{1,(N)} = \mu_1 = 0, \qquad \mu_{0,(N)} = \mu_0 = 1. \]


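We find further, without difficulty:

\[
\begin{aligned}
\mu_2 &= m_2 - m_1^2, \\
\mu_3 &= m_3 - 3 m_2 m_1 + 2 m_1^3, \\
\mu_4 &= m_4 - 4 m_3 m_1 + 6 m_2 m_1^2 - 3 m_1^4, \\
\mu_r &= m_r - C_r^1 m_{r-1} m_1 + \dots + (-1)^h C_r^h m_{r-h} m_1^h + \dots \\
&\quad + (-1)^{r-2} C_r^{r-2} m_2 m_1^{r-2} + (-1)^{r-1} (r-1) m_1^r, \\
\mu_{r,(N)} &= m_{r,(N)} - C_r^1 m_{r-1,(N)} m_1 + \dots + (-1)^h C_r^h m_{r-h,(N)} m_1^h + \dots \\
&\quad + (-1)^{r-2} C_r^{r-2} m_{2,(N)} m_1^{r-2} + (-1)^{r-1} (r-1) m_1^r.
\end{aligned}
\qquad (2)
\]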
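The passage from moments about zero to moments about the mean in relations (2) can be checked numerically. The sketch below is only an illustration: the function names `raw_moments` and `central_from_raw` and the three-point distribution are assumptions introduced here, not part of the memoir. It evaluates the binomial expansion of (2) in exact rational arithmetic and compares it with the direct definition of $\mu_4$.

```python
from fractions import Fraction
from math import comb


def raw_moments(values, probs, r_max):
    """Return [m_0, m_1, ..., m_rmax] with m_r = sum_i p_i * xi_i**r."""
    return [sum(p * x ** r for x, p in zip(values, probs))
            for r in range(r_max + 1)]


def central_from_raw(m, r):
    """mu_r from the raw moments m, using the binomial expansion behind (2).

    The terms h = r-1 and h = r of the sum combine into the closing
    term (-1)**(r-1) * (r-1) * m_1**r of the printed formula.
    """
    return sum((-1) ** h * comb(r, h) * m[r - h] * m[1] ** h
               for h in range(r + 1))


# Hypothetical three-point law, chosen only for the illustration.
values = [Fraction(0), Fraction(1), Fraction(3)]
probs = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]

m = raw_moments(values, probs, 4)
mu4_direct = sum(p * (x - m[1]) ** 4 for x, p in zip(values, probs))
mu4_from_raw = central_from_raw(m, 4)
```

Exact fractions are used so that the two values can be compared with `==` rather than with a floating-point tolerance.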


Conversely, expressing the quantities $m_r$ in terms of $\mu_r$, we find:

\[
\begin{aligned}
m_r &= E\{(X - m_1) + m_1\}^r = m_1^r + \sum_{h=2}^{r-1} C_r^h m_1^{r-h} \mu_h + \mu_r, \\
m_{r,(N)} &= m_1^r + \sum_{h=2}^{r-1} C_r^h m_1^{r-h} \mu_{h,(N)} + \mu_{r,(N)}, \\
\sum_{h=0}^{r} (-1)^h C_r^h m_{r-h}\, m_{h,(N)} &= \sum_{h=0}^{r} (-1)^h C_r^h \mu_{r-h}\, \mu_{h,(N)}.
\end{aligned}
\qquad (3)
\]
II

(1) Noting that

\[ X_{(N)}^r = \frac{1}{N^r} \Big[ \sum_{i=1}^{N} X_i \Big]^r, \]

we write $\big[ \sum_{i=1}^{N} X_i \big]^r$ in the form

\[ \sum_{j} \sum_{i_1, i_2, \dots, i_j} \sum_{r_1, r_2, \dots, r_j} \frac{r!}{j!\, r_1!\, r_2! \dots r_j!}\, X_{i_1}^{r_1} X_{i_2}^{r_2} \dots X_{i_j}^{r_j}, \]

where, as is known, the summation with regard to $j$ extends from 1 to the smaller
of the two integers $r$ and $N$; the summation with regard to $i_1, i_2, \dots, i_j$ extends to
all integral and unequal values of $i_1, i_2, \dots, i_j$ from 1 to $N$, and the last summation
extends to all positive integer or zero values of $r_1, r_2, \dots, r_j$ satisfying the relation
$r_1 + r_2 + \dots + r_j = r$.

Passing to mathematical expectations we find hence

\[
\begin{aligned}
m_{r,(N)} &= \frac{1}{N^r} \sum_{j} \sum_{i_1, \dots, i_j} \sum_{r_1, \dots, r_j} \frac{r!}{j!\, r_1!\, r_2! \dots r_j!}\, E X_{i_1}^{r_1}\, E X_{i_2}^{r_2} \dots E X_{i_j}^{r_j} \\
&= \frac{1}{N^r} \sum_{j} N^{[-j]} \sum \frac{r!}{j!\, r_1!\, r_2! \dots r_j!}\, m_{r_1} m_{r_2} \dots m_{r_j},
\end{aligned}
\]

where $N^{[-j]}$ stands for $N(N-1)\dots(N-j+1)$, the summation with regard to $j$ extends to all positive integer values from
1 to the smaller of the two numbers $r$ and $N$, while the second sum extends to
all positive integer values of $r_1, r_2, \dots, r_j$, satisfying the relation:

\[ r_1 + r_2 + \dots + r_j = r. \]

If $r < N$, we have consequently:

\[ m_{r,(N)} = \frac{1}{N^r} \sum_{j=1}^{r} R_{r,j}\, N^{[-j]} \qquad (4), \]

where the $R_{r,j}$ coefficients are independent of $N$ and are defined by the relation:

\[ R_{r,j} = \sum \frac{r!}{j!\, r_1!\, r_2! \dots r_j!}\, m_{r_1} m_{r_2} \dots m_{r_j}, \]
and the summation extends over the ranges specified above.

Hence we find:

\[ R_{r,r-h} = \sum_{i} \frac{r^{[-(h+i)]}}{i!}\, m_1^{r-h-i} \sum \frac{m_{h_1} m_{h_2} \dots m_{h_i}}{h_1!\, h_2! \dots h_i!}, \]

where the summation with regard to $i$ extends to all integral values from 1 to the
smaller of the two numbers $h$ and $r - h$, and the second summation extends to all
positive integer values not less than 2 of $h_1, h_2, \dots, h_i$, satisfying the condition:

\[ h_1 + h_2 + \dots + h_i = h + i, \]

or

\[ R_{r,r-h} = \sum_{i} r^{[-(h+i)]}\, m_1^{r-h-i} \sum \frac{m_{h_1}^{j_1} m_{h_2}^{j_2} \dots m_{h_\rho}^{j_\rho}}{j_1!\, j_2! \dots j_\rho!\, [h_1!]^{j_1} [h_2!]^{j_2} \dots [h_\rho!]^{j_\rho}} \qquad (5), \]

where the summation for $i$ extends to all integer values from 1 to the smaller of
the numbers $h$ and $r - h$, and the second summation extends to all positive integer
values of $j_1, j_2, \dots, j_\rho$, satisfying the equation:

\[ j_1 + j_2 + \dots + j_\rho = i, \]

and to all positive integer values of $h_1, h_2, \dots, h_\rho$, satisfying the conditions:

\[ 2 \le h_1 < h_2 < \dots < h_\rho, \qquad h_1 j_1 + h_2 j_2 + \dots + h_\rho j_\rho = h + i. \]

We find hence:

\[
\begin{aligned}
R_{r,r} &= m_1^r, \\
R_{r,r-1} &= C_r^2\, m_1^{r-2} m_2, \\
R_{r,r-2} &= 1 \cdot 3\, C_r^4\, m_1^{r-4} m_2^2 + C_r^3\, m_1^{r-3} m_3, \\
R_{r,r-3} &= 1 \cdot 3 \cdot 5\, C_r^6\, m_1^{r-6} m_2^3 + 10\, C_r^5\, m_1^{r-5} m_2 m_3 + C_r^4\, m_1^{r-4} m_4, \\
R_{r,r-4} &= 1 \cdot 3 \cdot 5 \cdot 7\, C_r^8\, m_1^{r-8} m_2^4 + 105\, C_r^7\, m_1^{r-7} m_2^2 m_3 \\
&\quad + C_r^6\, m_1^{r-6} [15 m_2 m_4 + 10 m_3^2] + C_r^5\, m_1^{r-5} m_5,
\end{aligned}
\qquad (6)
\]

and on the other hand

\[
\begin{aligned}
R_{r,1} &= m_r, \\
R_{r,2} &= \frac{1}{2} \sum_{h=1}^{r-1} C_r^h\, m_h m_{r-h},
\end{aligned}
\qquad (7)
\]

and so on.


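(2) The calculation of the coefficients $R_{r,r-h}$ may also be effected by another
method.

Noting that

\[ E \Big[ \sum_{i=1}^{N} X_i \Big]^{r+1} = N\, E\, X_N \Big[ \sum_{i=1}^{N} X_i \Big]^{r}, \]

we find

\[
\begin{aligned}
m_{r+1,(N)} &= \frac{1}{N^r}\, E\, X_N \Big[ \sum_{i=1}^{N} X_i \Big]^r = \frac{1}{N^r}\, E\, X_N \Big[ X_N + \sum_{i=1}^{N-1} X_i \Big]^r \\
&= \frac{1}{N^r} \Big\{ m_{r+1} + \sum_{h=1}^{r-1} C_r^h\, m_{r+1-h} (N-1)^h m_{h,(N-1)} + (N-1)^r m_1 m_{r,(N-1)} \Big\} \\
&= \frac{1}{N^r} \Big\{ m_{r+1} + \sum_{h=1}^{r} C_r^h (N-1)^h\, m_{r+1-h}\, m_{h,(N-1)} \Big\}.
\end{aligned}
\qquad (8)
\]

Substituting here (see (4))

\[ (N-1)^h m_{h,(N-1)} = \sum_{j=1}^{h} [N-1]^{[-j]} R_{h,j}, \]

we find:

\[
\begin{aligned}
m_{r+1,(N)} &= \frac{1}{N^r} \Big\{ m_{r+1} + \sum_{h=1}^{r} C_r^h\, m_{r+1-h} \sum_{j=1}^{h} [N-1]^{[-j]} R_{h,j} \Big\} \\
&= \frac{1}{N^{r+1}} \Big\{ N m_{r+1} + \sum_{j=1}^{r} N^{[-(j+1)]} \sum_{h=j}^{r} C_r^h\, m_{r+1-h} R_{h,j} \Big\}.
\end{aligned}
\]

On the other hand

\[ m_{r+1,(N)} = \frac{1}{N^{r+1}} \sum_{i=1}^{r+1} N^{[-i]} R_{r+1,i}. \]

Hence

\[
\begin{aligned}
R_{r+1,1} &= m_{r+1}, \\
R_{r+1,i} &= \sum_{h=i-1}^{r} C_r^h\, m_{r+1-h}\, R_{h,i-1}, \\
R_{r+1,r+1} &= m_1 R_{r,r} = m_1^{r+1}.
\end{aligned}
\qquad (9)
\]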
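The recurrence (9) lends itself to mechanical evaluation. The following sketch is an illustration under stated assumptions, not the author's procedure: the helper names (`falling`, `r_table`, `mean_raw_moment_formula`, `mean_raw_moment_brute`) and the three-point distribution are hypothetical. It builds the coefficients $R_{r,i}$ from (9), inserts them in (4), and compares the result with a direct enumeration of all $k^N$ equally weighted-by-probability samples.

```python
from fractions import Fraction
from itertools import product
from math import comb


def falling(n, j):
    """N^{[-j]} = N (N-1) ... (N-j+1)."""
    out = 1
    for t in range(j):
        out *= n - t
    return out


def r_table(m, r_max):
    """R[r][i] for 1 <= i <= r <= r_max, built row by row with recurrence (9)."""
    R = {1: {1: m[1]}}
    for r in range(1, r_max):
        row = {1: m[r + 1]}                      # R_{r+1,1} = m_{r+1}
        for i in range(2, r + 2):
            row[i] = sum(comb(r, h) * m[r + 1 - h] * R[h][i - 1]
                         for h in range(i - 1, r + 1))
        R[r + 1] = row
    return R


def mean_raw_moment_formula(m, r, N):
    """m_{r,(N)} from formula (4)."""
    R = r_table(m, r)
    return sum(R[r][j] * falling(N, j) for j in range(1, r + 1)) / Fraction(N) ** r


def mean_raw_moment_brute(values, probs, r, N):
    """m_{r,(N)} by enumerating every possible sample of size N."""
    total = Fraction(0)
    for idx in product(range(len(values)), repeat=N):
        p = Fraction(1)
        s = Fraction(0)
        for i in idx:
            p *= probs[i]
            s += values[i]
        total += p * (s / N) ** r
    return total


# Hypothetical three-point law, chosen only for the illustration.
values = [Fraction(0), Fraction(1), Fraction(3)]
probs = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
m = [sum(p * x ** r for x, p in zip(values, probs)) for r in range(5)]
```

Because $N^{[-j]}$ vanishes for $j > N$, the same function can be called with $r > N$ as well; the enumeration grows as $k^N$ and is meant for small checks only.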
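(3) Putting (see Introduction (2))

\[ N^{[-j]} = \sum_{s=0}^{j-1} (-1)^s \beta_{j,s}\, N^{j-s}, \]

we find from (4):

\[
\begin{aligned}
m_{r,(N)} &= \frac{1}{N^r} \sum_{j=1}^{r} R_{r,j} \sum_{s=0}^{j-1} (-1)^s \beta_{j,s}\, N^{j-s} \\
&= \sum_{i=0}^{r-1} \frac{1}{N^i} \sum_{h=0}^{i} (-1)^h \beta_{r-i+h,h}\, R_{r,r-i+h} \\
&= m_1^r + \sum_{i=1}^{r-1} \frac{1}{N^i} \sum_{h=0}^{i} (-1)^h \beta_{r-i+h,h}\, R_{r,r-i+h} \\
&= m_1^r + \frac{1}{N}\, C_r^2\, m_1^{r-2} \mu_2 + \frac{1}{N^2} \big[ C_r^3\, m_1^{r-3} \mu_3 + 3\, C_r^4\, m_1^{r-4} \mu_2^2 \big] + \dots
\end{aligned}
\qquad (10)
\]

When $r = 2, 3, 4$ we find:

\[
\begin{aligned}
m_{2,(N)} &= m_1^2 + \frac{1}{N}\, \mu_2, \\
m_{3,(N)} &= m_1^3 + \frac{3}{N}\, m_1 \mu_2 + \frac{1}{N^2}\, \mu_3, \\
m_{4,(N)} &= m_1^4 + \frac{6}{N}\, m_1^2 \mu_2 + \frac{1}{N^2} [4 m_1 \mu_3 + 3 \mu_2^2] + \frac{1}{N^3} [\mu_4 - 3 \mu_2^2].
\end{aligned}
\qquad (11)
\]

III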
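(1) If $m_1 = 0$, then

\[ \mu_r = E (X - m_1)^r = E X^r = m_r, \]

and $\mu_{r,(N)} = m_{r,(N)}$.

Putting $m_1 = 0$ in the formulae of § II, we may, consequently, replace in them
the quantities $m$ by the corresponding quantities $\mu$. But when $m_1 = 0$, $R_{r,r-h} = 0$
if $h < r/2$; but if $h \ge r/2$, then $R_{r,r-h}$ reduces (cf. (5)) to

\[ r! \sum \frac{\mu_{h_1}^{j_1} \mu_{h_2}^{j_2} \dots \mu_{h_\rho}^{j_\rho}}{j_1!\, j_2! \dots j_\rho!\, [h_1!]^{j_1} [h_2!]^{j_2} \dots [h_\rho!]^{j_\rho}}, \]

where the summation extends to all positive integer values of $j_1, j_2, \dots, j_\rho$, satisfying
the condition $j_1 + j_2 + \dots + j_\rho = r - h$, and to all positive integer values of
$h_1, h_2, \dots, h_\rho$, satisfying the conditions:

\[ 2 \le h_1 < h_2 < \dots < h_\rho, \qquad h_1 j_1 + h_2 j_2 + \dots + h_\rho j_\rho = r. \]

Hence

\[
\begin{aligned}
\mu_{2r,(N)} &= \frac{1}{N^{2r}} \sum_{h=r}^{2r-1} N^{[-(2r-h)]} R_{2r,2r-h} = \frac{1}{N^{2r}} \sum_{h=0}^{r-1} N^{[-(r-h)]}\, T_{2r,r-h}, \\
\mu_{2r+1,(N)} &= \frac{1}{N^{2r+1}} \sum_{h=r+1}^{2r} N^{[-(2r+1-h)]} R_{2r+1,2r+1-h} = \frac{1}{N^{2r+1}} \sum_{h=0}^{r-1} N^{[-(r-h)]}\, T_{2r+1,r-h},
\end{aligned}
\qquad (13)
\]

or

\[ \mu_{r,(N)} = \frac{1}{N^r} \sum_{h=0}^{\mathrm{Ent.}(r/2) - 1} N^{[-(\mathrm{Ent.}(r/2) - h)]}\, T_{r,\mathrm{Ent.}(r/2) - h} \qquad (14), \]

where $\mathrm{Ent.}(r/2)$ denotes the greatest integer in $r/2$.

If we now put

\[ N^{[-(r-h)]} = \sum_{t=0}^{r-h-1} (-1)^t \beta_{r-h,t}\, N^{r-h-t}, \]

we find after some transformations:

\[
\begin{aligned}
\mu_{2r,(N)} &= \sum_{i=0}^{r-1} \frac{1}{N^{r+i}} \sum_{h=0}^{i} (-1)^h \beta_{r-i+h,h}\, T_{2r,r-i+h}, \\
\mu_{2r+1,(N)} &= \sum_{i=0}^{r-1} \frac{1}{N^{r+1+i}} \sum_{h=0}^{i} (-1)^h \beta_{r-i+h,h}\, T_{2r+1,r-i+h},
\end{aligned}
\qquad (15)
\]

or

\[ \mu_{r,(N)} = \sum_{i=0}^{\mathrm{Ent.}(r/2) - 1} \frac{1}{N^{\,r - \mathrm{Ent.}(r/2) + i}} \sum_{h=0}^{i} (-1)^h \beta_{\mathrm{Ent.}(r/2) - i + h,\,h}\, T_{r,\mathrm{Ent.}(r/2) - i + h} \qquad (16). \]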
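When the first index of $T$ is even,

\[
\begin{aligned}
T_{2r,r} &= 1 \cdot 3 \cdot 5 \dots (2r-1)\, \mu_2^r, \\
T_{2r,r-h} &= (2r)! \sum_{i} \frac{\mu_2^{r-h-i}}{2^{r-h-i}\, (r-h-i)!} \sum \frac{\mu_{h_1}^{j_1} \mu_{h_2}^{j_2} \dots \mu_{h_\rho}^{j_\rho}}{j_1!\, j_2! \dots j_\rho!\, [h_1!]^{j_1} [h_2!]^{j_2} \dots [h_\rho!]^{j_\rho}} \\
&= 1 \cdot 3 \cdot 5 \dots (2r-1) \sum_{i} r^{[-(h+i)]}\, 2^{h+i}\, \mu_2^{r-h-i} \sum \frac{\mu_{h_1}^{j_1} \mu_{h_2}^{j_2} \dots \mu_{h_\rho}^{j_\rho}}{j_1!\, j_2! \dots j_\rho!\, [h_1!]^{j_1} [h_2!]^{j_2} \dots [h_\rho!]^{j_\rho}},
\end{aligned}
\qquad (17)
\]

where the summation for $i$ extends to all integer values from 1 to the smaller of
the numbers $2h$ and $r - h$, and the last summation to all positive integer values
of $j_1, j_2, \dots, j_\rho$, satisfying the condition:

\[ j_1 + j_2 + \dots + j_\rho = i, \]

and to all positive integer values of $h_1, h_2, \dots, h_\rho$, satisfying the conditions:

\[ 3 \le h_1 < h_2 < \dots < h_\rho, \qquad h_1 j_1 + h_2 j_2 + \dots + h_\rho j_\rho = 2h + 2i. \]

When the first index of $T$ is odd,

\[
\begin{aligned}
T_{2r+1,r} &= 1 \cdot 3 \cdot 5 \dots (2r+1)\, \frac{r}{3}\, \mu_2^{r-1} \mu_3, \\
T_{2r+1,r-h} &= (2r+1)! \sum_{i} \frac{\mu_2^{r-h-i}}{2^{r-h-i}\, (r-h-i)!} \sum \frac{\mu_{h_1}^{j_1} \mu_{h_2}^{j_2} \dots \mu_{h_\rho}^{j_\rho}}{j_1!\, j_2! \dots j_\rho!\, [h_1!]^{j_1} [h_2!]^{j_2} \dots [h_\rho!]^{j_\rho}} \\
&= 1 \cdot 3 \cdot 5 \dots (2r+1) \sum_{i} r^{[-(h+i)]}\, 2^{h+i}\, \mu_2^{r-h-i} \sum \frac{\mu_{h_1}^{j_1} \mu_{h_2}^{j_2} \dots \mu_{h_\rho}^{j_\rho}}{j_1!\, j_2! \dots j_\rho!\, [h_1!]^{j_1} [h_2!]^{j_2} \dots [h_\rho!]^{j_\rho}},
\end{aligned}
\qquad (18)
\]

where the summation for $i$ extends to all integer values from 1 to the smaller of
the numbers $2h + 1$ and $r - h$, and the last summation extends to all positive
integer values of $j_1, j_2, \dots, j_\rho$ satisfying the condition:

\[ j_1 + j_2 + \dots + j_\rho = i, \]

and to all integer values of $h_1, h_2, \dots, h_\rho$, satisfying the conditions:

\[ 3 \le h_1 < h_2 < \dots < h_\rho, \qquad h_1 j_1 + h_2 j_2 + \dots + h_\rho j_\rho = 2h + 2i + 1. \]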

P ° 
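The coefficients $T$ may also be calculated by means of recurrence formulae.
Putting $m_1 = 0$ in (8) and replacing the $m$'s by $\mu$'s, we find:

\[ \mu_{r+1,(N)} = \frac{1}{N^r} \Big\{ \mu_{r+1} + \sum_{h=2}^{r-1} C_r^h\, \mu_{r+1-h} (N-1)^h \mu_{h,(N-1)} \Big\}, \]

and hence (or directly from (9))

\[
\begin{aligned}
T_{r+1,1} &= \mu_{r+1}, \\
T_{r+1,i} &= \sum_{h=i-1}^{r-1} C_r^h\, \mu_{r+1-h}\, T_{h,i-1}.
\end{aligned}
\]

We find in this way:

\[
\begin{aligned}
T_{2r,r} &= 1 \cdot 3 \cdot 5 \dots (2r-1)\, \mu_2^r, \\
T_{2r,r-1} &= 1 \cdot 3 \cdot 5 \dots (2r-1) \big\{ \tfrac{1}{6} r^{[-2]} \mu_2^{r-2} \mu_4 + \tfrac{1}{9} r^{[-3]} \mu_2^{r-3} \mu_3^2 \big\}, \\
T_{2r,r-2} &= 1 \cdot 3 \cdot 5 \dots (2r-1) \big\{ \tfrac{1}{90} r^{[-3]} \mu_2^{r-3} \mu_6 + \tfrac{1}{45} r^{[-4]} \mu_2^{r-4} \mu_3 \mu_5 + \tfrac{1}{72} r^{[-4]} \mu_2^{r-4} \mu_4^2 \\
&\quad + \tfrac{1}{54} r^{[-5]} \mu_2^{r-5} \mu_3^2 \mu_4 + \tfrac{1}{486} r^{[-6]} \mu_2^{r-6} \mu_3^4 \big\},
\end{aligned}
\qquad (20)
\]

\[
\begin{aligned}
T_{2r+1,r} &= 1 \cdot 3 \cdot 5 \dots (2r+1)\, \tfrac{1}{3} r\, \mu_2^{r-1} \mu_3, \\
T_{2r+1,r-1} &= 1 \cdot 3 \cdot 5 \dots (2r+1) \big\{ \tfrac{1}{30} r^{[-2]} \mu_2^{r-2} \mu_5 + \tfrac{1}{18} r^{[-3]} \mu_2^{r-3} \mu_3 \mu_4 + \tfrac{1}{81} r^{[-4]} \mu_2^{r-4} \mu_3^3 \big\}, \\
T_{2r+1,r-2} &= 1 \cdot 3 \cdot 5 \dots (2r+1) \big\{ \tfrac{1}{630} r^{[-3]} \mu_2^{r-3} \mu_7 + \tfrac{1}{270} r^{[-4]} \mu_2^{r-4} \mu_3 \mu_6 + \tfrac{1}{180} r^{[-4]} \mu_2^{r-4} \mu_4 \mu_5 \\
&\quad + \tfrac{1}{270} r^{[-5]} \mu_2^{r-5} \mu_3^2 \mu_5 + \tfrac{1}{216} r^{[-5]} \mu_2^{r-5} \mu_3 \mu_4^2 \\
&\quad + \tfrac{1}{486} r^{[-6]} \mu_2^{r-6} \mu_3^3 \mu_4 + \tfrac{1}{7290} r^{[-7]} \mu_2^{r-7} \mu_3^5 \big\}.
\end{aligned}
\qquad (21)
\]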
Hence: 
) 
pr) 21 BB one B= 1) fe a | 
i Paves eee So wae 
li Nom [dat 2d peg"? pg + ort pel ps? — trl] pe") | 
1 san 7) 


a Nr Bort py! pg +7 Le ol—4) pol —4 fg fs + Sie ye rn 


r—1 = 

ae ari (ees 98: [3 ) pb” 2 aha? [6] [ORES jist 
(P=) 2) a or tle ) 
Se Se ne ne 1-3) wt | ee? 
18 2 pal? pe + 24. 7 Pe j 


) 
\ 


1 


+ 4 past ps! = Gr pa | | 


1 
Morti(ny = 1.3.5... 1 2r+1) apes BP pa”? ps 


1 
i Nr pone eer) este afta fy 


1 eee = ee , 
= Nt star Pa! py + gta 1 po” pate + THO 79 pos! fafls > (23). 
(PII 


try IO al ast os — EG Ma st be TT Mal Hobe 
Behe ere 
te st yl—6] a Ls fs — (7 Ae Be 3] 1 ita 3 Us sMatrs ghar nat 7 pes 
r—1 —2 —4 r r 1 8 | 
¥i ea ) a fis at es te a 8 mee 
y) 
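On the other hand,

\[
\begin{aligned}
T_{2r,1} &= \mu_{2r}, \\
T_{2r,2} &= \frac{1}{2} \sum_{h=2}^{2r-2} C_{2r}^h\, \mu_h \mu_{2r-h} = \sum_{h=2}^{r-1} C_{2r}^h\, \mu_h \mu_{2r-h} + \frac{1}{2}\, C_{2r}^r\, \mu_r^2, \\
T_{2r+1,1} &= \mu_{2r+1}, \\
T_{2r+1,2} &= \sum_{h=2}^{r} C_{2r+1}^h\, \mu_h \mu_{2r+1-h}.
\end{aligned}
\qquad (24)
\]

For $r = 2, 3, 4, 5, 6, 7, 8$ we find hence:

\[
\begin{aligned}
\mu_{2,(N)} &= \frac{1}{N}\, \mu_2, \\
\mu_{3,(N)} &= \frac{1}{N^2}\, \mu_3, \\
\mu_{4,(N)} &= \frac{1}{N^4} \{ N \mu_4 + 3 N^{[-2]} \mu_2^2 \} = \frac{3}{N^2}\, \mu_2^2 + \frac{1}{N^3} [\mu_4 - 3 \mu_2^2], \\
\mu_{5,(N)} &= \frac{1}{N^5} \{ N \mu_5 + 10 N^{[-2]} \mu_3 \mu_2 \} = \frac{10}{N^3}\, \mu_3 \mu_2 + \frac{1}{N^4} [\mu_5 - 10 \mu_3 \mu_2], \\
\mu_{6,(N)} &= \frac{1}{N^6} \{ N \mu_6 + N^{[-2]} [15 \mu_4 \mu_2 + 10 \mu_3^2] + 15 N^{[-3]} \mu_2^3 \} \\
&= \frac{15}{N^3}\, \mu_2^3 + \frac{5}{N^4} [3 \mu_4 \mu_2 + 2 \mu_3^2 - 9 \mu_2^3] + \frac{1}{N^5} [\mu_6 - 15 \mu_4 \mu_2 - 10 \mu_3^2 + 30 \mu_2^3], \\
\mu_{7,(N)} &= \frac{1}{N^7} \{ N \mu_7 + N^{[-2]} [21 \mu_5 \mu_2 + 35 \mu_4 \mu_3] + 105 N^{[-3]} \mu_3 \mu_2^2 \} \\
&= \frac{105}{N^4}\, \mu_3 \mu_2^2 + \frac{7}{N^5} [3 \mu_5 \mu_2 + 5 \mu_4 \mu_3 - 45 \mu_3 \mu_2^2] \\
&\quad + \frac{1}{N^6} [\mu_7 - 21 \mu_5 \mu_2 - 35 \mu_4 \mu_3 + 210 \mu_3 \mu_2^2], \\
\mu_{8,(N)} &= \frac{1}{N^8} \{ N \mu_8 + N^{[-2]} [28 \mu_6 \mu_2 + 56 \mu_5 \mu_3 + 35 \mu_4^2] \\
&\quad + N^{[-3]} [210 \mu_4 \mu_2^2 + 280 \mu_3^2 \mu_2] + 105 N^{[-4]} \mu_2^4 \} \\
&= \frac{105}{N^4}\, \mu_2^4 + \frac{70}{N^5} [3 \mu_4 \mu_2^2 + 4 \mu_3^2 \mu_2 - 9 \mu_2^4] \\
&\quad + \frac{7}{N^6} [4 \mu_6 \mu_2 + 8 \mu_5 \mu_3 + 5 \mu_4^2 - 90 \mu_4 \mu_2^2 - 120 \mu_3^2 \mu_2 + 165 \mu_2^4] \\
&\quad + \frac{1}{N^7} [\mu_8 - 28 \mu_6 \mu_2 - 56 \mu_5 \mu_3 - 35 \mu_4^2 + 420 \mu_4 \mu_2^2 + 560 \mu_3^2 \mu_2 - 630 \mu_2^4].
\end{aligned}
\qquad (26)
\]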
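The closed expressions for $\mu_{2,(N)}, \dots, \mu_{6,(N)}$ can be confronted with a direct enumeration. The sketch below is offered as a check only, under assumptions introduced here: the function names and the three-point distribution are hypothetical, and the closed forms coded are those of relations (26) as far as the sixth order.

```python
from fractions import Fraction
from itertools import product


def central_moment(values, probs, r):
    """mu_r of the parent law."""
    m1 = sum(p * x for x, p in zip(values, probs))
    return sum(p * (x - m1) ** r for x, p in zip(values, probs))


def mean_central_brute(values, probs, r, N):
    """mu_{r,(N)} = E[X_(N) - m_1]**r by enumerating all samples of size N."""
    m1 = sum(p * x for x, p in zip(values, probs))
    total = Fraction(0)
    for idx in product(range(len(values)), repeat=N):
        p = Fraction(1)
        s = Fraction(0)
        for i in idx:
            p *= probs[i]
            s += values[i]
        total += p * (s / N - m1) ** r
    return total


def mean_central_closed(mu, r, N):
    """Closed forms of relations (26) for r = 2, ..., 6 (mu maps order -> mu_r)."""
    N = Fraction(N)
    if r == 2:
        return mu[2] / N
    if r == 3:
        return mu[3] / N ** 2
    if r == 4:
        return 3 * mu[2] ** 2 / N ** 2 + (mu[4] - 3 * mu[2] ** 2) / N ** 3
    if r == 5:
        return 10 * mu[3] * mu[2] / N ** 3 + (mu[5] - 10 * mu[3] * mu[2]) / N ** 4
    if r == 6:
        return (15 * mu[2] ** 3 / N ** 3
                + 5 * (3 * mu[4] * mu[2] + 2 * mu[3] ** 2 - 9 * mu[2] ** 3) / N ** 4
                + (mu[6] - 15 * mu[4] * mu[2] - 10 * mu[3] ** 2 + 30 * mu[2] ** 3) / N ** 5)
    raise ValueError("only r = 2..6 are coded in this sketch")


# Hypothetical skew three-point law, chosen only for the illustration.
values = [Fraction(0), Fraction(1), Fraction(3)]
probs = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
mu = {r: central_moment(values, probs, r) for r in range(2, 7)}
```

A skew parent law is chosen deliberately, so that the terms in $\mu_3$ and $\mu_5$ do not vanish and are actually exercised by the comparison.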


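(2) Noting that

\[
\begin{aligned}
X_{(N)} - m_1 &= \frac{1}{N} \sum_{i=1}^{N} (X_i - m_1) = \frac{1}{N} \Big\{ \sum_{i=1}^{h} (X_i - m_1) + \sum_{i=h+1}^{N} (X_i - m_1) \Big\} \\
&= \frac{1}{N} \big\{ h\, [X_{(h)} - m_1] + (N - h)\, [X_{(N-h)} - m_1] \big\},
\end{aligned}
\]

we find

\[
\begin{aligned}
[X_{(N)} - m_1]^r &= \frac{1}{N^r} \sum_{j=0}^{r} C_r^j\, h^j (N-h)^{r-j}\, [X_{(h)} - m_1]^j\, [X_{(N-h)} - m_1]^{r-j}, \\
\mu_{r,(N)} &= \frac{1}{N^r} \sum_{j=0}^{r} C_r^j\, h^j (N-h)^{r-j}\, \mu_{j,(h)}\, \mu_{r-j,(N-h)}, \\
N^r \mu_{r,(N)} &= h^r \mu_{r,(h)} + \sum_{j=2}^{r-2} C_r^j\, h^j \mu_{j,(h)}\, (N-h)^{r-j} \mu_{r-j,(N-h)} + (N-h)^r \mu_{r,(N-h)}.
\end{aligned}
\]

Putting $h = 1$, we obtain

\[ N^r \mu_{r,(N)} = \mu_r + \sum_{j=2}^{r-2} C_r^j\, (N-1)^j\, \mu_{r-j}\, \mu_{j,(N-1)} + (N-1)^r \mu_{r,(N-1)}. \]

Hence:

\[ N^r \mu_{r,(N)} - (N-1)^r \mu_{r,(N-1)} = \mu_r + \sum_{j=2}^{r-2} C_r^j\, (N-1)^j\, \mu_{r-j}\, \mu_{j,(N-1)}, \]

and

\[ N^r \mu_{r,(N)} = N \mu_r + \sum_{j=2}^{r-2} C_r^j\, \mu_{r-j} \sum_{i=1}^{N-1} i^j \mu_{j,(i)} \qquad (25). \]

If we give $r$ in turns the values 2, 3, 4, ..., we find the relations (26).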
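The step from $N - 1$ to $N$ observations described above can be turned into a short recursive routine. The sketch below is an assumption-laden illustration: the function name `scaled_mean_central`, the memoisation, and the numerical values of the moments are all hypothetical. It accumulates $N^r \mu_{r,(N)}$ from the difference relation and compares the outcome with the expressions in $N^{[-2]}$ and $N^{[-3]}$ given under (26).

```python
from fractions import Fraction
from math import comb


def scaled_mean_central(mu, r, N, memo=None):
    """Return N**r * mu_{r,(N)} using
    N^r mu_{r,(N)} - (N-1)^r mu_{r,(N-1)}
        = mu_r + sum_{j=2}^{r-2} C_r^j (N-1)^j mu_{r-j} mu_{j,(N-1)}.

    mu maps an order to the parent central moment (mu_1 = 0 is implied).
    """
    if memo is None:
        memo = {}
    if r == 0:
        return Fraction(1)
    if r == 1 or N == 0:
        return Fraction(0)
    key = (r, N)
    if key not in memo:
        memo[key] = (scaled_mean_central(mu, r, N - 1, memo)
                     + mu[r]
                     + sum(comb(r, j) * mu[r - j]
                           * scaled_mean_central(mu, j, N - 1, memo)
                           for j in range(2, r - 1)))
    return memo[key]


# Arbitrary illustrative values of mu_2 ... mu_6 (not taken from the memoir).
mu = {2: Fraction(2), 3: Fraction(-1), 4: Fraction(11), 5: Fraction(3), 6: Fraction(90)}
N = 5
lhs4 = scaled_mean_central(mu, 4, N)
rhs4 = N * mu[4] + 3 * N * (N - 1) * mu[2] ** 2
lhs6 = scaled_mean_central(mu, 6, N)
rhs6 = (N * mu[6]
        + N * (N - 1) * (15 * mu[4] * mu[2] + 10 * mu[3] ** 2)
        + 15 * N * (N - 1) * (N - 2) * mu[2] ** 3)
```

The recursion needs only the parent moments, so no enumeration of samples is involved; its depth grows with $N$, which is harmless for the small values used in a check.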


IV

(1) From the relations (15) and (17) we find:

\[ \frac{\mu_{2r,(N)}}{\mu_{2,(N)}^{\,r}} = 1 \cdot 3 \cdot 5 \dots (2r-1) \Big\{ 1 + \sum_{i=1}^{r-1} \frac{1}{N^i} \sum_{h=0}^{i} (-1)^h \beta_{r-i+h,h}\, \frac{T_{2r,r-i+h}}{1 \cdot 3 \cdot 5 \dots (2r-1)\, \mu_2^r} \Big\}. \]

As $N$ increases the ratio $\mu_{2r,(N)} / \mu_{2,(N)}^{\,r}$ consequently tends to the limit $1 \cdot 3 \cdot 5 \dots (2r-1)$, if

\[ \sum_{i=1}^{r-1} \frac{1}{N^i} \sum_{h=0}^{i} (-1)^h \beta_{r-i+h,h}\, \frac{T_{2r,r-i+h}}{1 \cdot 3 \cdot 5 \dots (2r-1)\, \mu_2^r} \]

tends to zero.
But if this last expression tends to a limit different from zero as $N$ increases,
then the limit to which $\mu_{2r,(N)} / \mu_{2,(N)}^{\,r}$ tends cannot be equal to $1 \cdot 3 \cdot 5 \dots (2r-1)$.

The quantity $\beta_{r-i+h,h} / [1 \cdot 3 \cdot 5 \dots (2r-1)]$ does not become infinitely great for
any value of $r$ and is independent both of the value of $N$ and of the law of distribution
of values of $X$. In order that $\mu_{2r,(N)} / \mu_{2,(N)}^{\,r}$ should tend to $1 \cdot 3 \cdot 5 \dots (2r-1)$
it appears then to be a sufficient condition that

\[ \frac{1}{N^i}\, \frac{T_{2r,r-i+h}}{\mu_2^r} \]

should tend to zero for $i = 1, 2, 3, \dots, r-1$. A sufficient condition for this, in its turn, is that
expressions of type

\[ \frac{1}{N^i}\, \frac{\mu_{h_1}^{j_1} \mu_{h_2}^{j_2} \dots \mu_{h_\rho}^{j_\rho}}{\mu_2^{\,i-h+l}} \]

should tend to zero, when the quantities $j_1, j_2, \dots, j_\rho$ are connected by the relation:
$j_1 + j_2 + \dots + j_\rho = l$, and the quantities $h_1, h_2, \dots, h_\rho$ satisfy the relation:

\[ h_1 j_1 + h_2 j_2 + \dots + h_\rho j_\rho = 2 (i - h + l), \]

and $l$ can take all integral values between 1 and $2(i - h)$. Finally this condition
is satisfied, if expressions of type

\[ \frac{N \mu_i}{[N \mu_2]^{i/2}} \]

tend to zero, as $N$ increases, when $i = 3, 4, 5, \dots, 2r-1, 2r$. Noting that when
these conditions are satisfied the fraction $\mu_{2r+1,(N)} / [\mu_{2,(N)}]^{(2r+1)/2}$ tends to zero, we arrive at
the well-known result (cf. A. A. Markoff, Theory of Probability, 3rd Edition,
pp. 329-330).

The probability of the fulfilment of the inequality

\[ t_1 < \frac{1}{\sqrt{2 N \mu_2}} \sum_{i=1}^{N} (X_i - m_1) < t_2 \]

tends with increasing $N$ to the limit $\frac{1}{\sqrt{\pi}} \int_{t_1}^{t_2} e^{-t^2}\, dt$, whatever be the law of
distribution of values of the variable $X$, if only it satisfies the condition, that
$N \mu_i / [N \mu_2]^{i/2}$ should tend with increasing $N$ to zero, when $i = 3, 4, 5, \dots, \infty$, and if at
the same time the law of distribution of values remains unaltered, and the separate
experiments are mutually independent*.


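(2) From (22) we find:

\[ \frac{\mu_{2r,(N)}}{1 \cdot 3 \cdot 5 \dots (2r-1)\, \mu_{2,(N)}^{\,r}} = 1 + \frac{1}{N} \Big\{ \frac{1}{6}\, r^{[-2]}\, \frac{\mu_4 - 3 \mu_2^2}{\mu_2^2} + \frac{1}{9}\, r^{[-3]}\, \frac{\mu_3^2}{\mu_2^3} \Big\} + \dots \]

Thus we see that $\mu_{2r,(N)} / \mu_{2,(N)}^{\,r}$ tends the more slowly to $1 \cdot 3 \cdot 5 \dots (2r-1)$, the more
the law of distribution of values deviates from the Gauss-Laplace law, and the
greater $r$ is. If $\mu_4 > 3 \mu_2^2$, then for sufficiently large values of $N$,

\[ \frac{\mu_{2r,(N)}}{\mu_{2,(N)}^{\,r}} > 1 \cdot 3 \cdot 5 \dots (2r-1). \]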

* The condition "$N \mu_i / [N \mu_2]^{i/2}$ tends, with increasing $N$, to zero for $i = 3, 4, 5, \dots, \infty$," while sufficient,
is, as is well known, not necessary. From the form to which Liapounoff succeeded in reducing the
condition (see Proc. Imp. Acad. Sci. VIII series, vol. XII) it follows, among other consequences, that
the law of distribution of values of $X_{(N)}$ tends with increasing $N$ to the Gaussian, if $N \mu_4 / [N \mu_2]^2$ tends to
zero. Noting that

\[ \frac{\mu_{4,(N)}}{\mu_{2,(N)}^2} - 3 = \frac{1}{N} \Big[ \frac{\mu_4}{\mu_2^2} - 3 \Big], \]

we see that this condition is satisfied if $\mu_{4,(N)} / \mu_{2,(N)}^2$ tends with increasing $N$ to 3. It is in this way that
Liapounoff's results justify arguments based on the examination of the two moments only, $\mu_{2,(N)}$ and
$\mu_{4,(N)}$, in deciding the question, whether the law of distribution of the values of $X_{(N)}$ tends to the
Gaussian with increasing $N$ or not. The assumption usually underlying such a procedure, viz. that
the law of distribution is the Gauss-Laplace law, if $\mu_3^2 / \mu_2^3 = 0$ and $\mu_4 / \mu_2^2 = 3$, is clearly inexact: the coincidence
of the values of two or even more moments does not guarantee the identity of the laws, but merely
compresses the possible divergence between them into limits which become narrower as the number of
coinciding moments increases (cf. the investigation by Chebysheff "On the Integral..., forming Approximations
to the Value of an Integral" in Oeuvres, t. II, and the related papers by A. A. Markoff;
cf. also T. J. Stieltjes, "Recherches sur les fractions continues" in Annales de la faculté de Toulouse
(1894)).

V


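Equations (15), (16), (17) and (18) hold for all laws of distribution of the
variable $X$. If the law fulfils the conditions

\[ \mu_{2i+1} = 0 \quad \text{for } i = 0, 1, 2, \dots, r, \]

then $\mu_{2r+1,(N)} = 0$, for all the coefficients $T_{2r+1,r-h}$ then vanish. As regards the
coefficients $T_{2r,r-h}$, they take the form:

\[ T_{2r,r-h} = 1 \cdot 3 \cdot 5 \dots (2r-1) \sum_{i=1}^{h} r^{[-(h+i)]}\, 2^{h+i}\, \mu_2^{r-h-i} \sum \frac{\mu_{2h_1}^{j_1} \mu_{2h_2}^{j_2} \dots \mu_{2h_\rho}^{j_\rho}}{j_1!\, j_2! \dots j_\rho!\, [(2h_1)!]^{j_1} [(2h_2)!]^{j_2} \dots [(2h_\rho)!]^{j_\rho}}, \]

where the last summation extends to all positive integer values $j_1, j_2, \dots, j_\rho$, satisfying
the condition: $j_1 + j_2 + \dots + j_\rho = i$, and to all integer values of $h_1, h_2, \dots, h_\rho$,
satisfying the conditions:

\[ 2 \le h_1 < h_2 < \dots < h_\rho, \qquad h_1 j_1 + h_2 j_2 + \dots + h_\rho j_\rho = h + i. \]

If the law of distribution of values of $X$ also satisfies the conditions:

\[ \mu_{2i} = 1 \cdot 3 \cdot 5 \dots (2i-1)\, \mu_2^i \quad \text{for } i = 1, 2, 3, \dots, r, \]

then (cf. Introduction (5), (8) and (16))

\[
\begin{aligned}
T_{2r,r-h} &= 1 \cdot 3 \cdot 5 \dots (2r-1)\, \mu_2^r \sum_{i=1}^{h} r^{[-(h+i)]}\, 2^{h+i} \\
&\qquad \times \sum \frac{[1 \cdot 3 \cdot 5 \dots (2h_1 - 1)]^{j_1} [1 \cdot 3 \cdot 5 \dots (2h_2 - 1)]^{j_2} \dots [1 \cdot 3 \cdot 5 \dots (2h_\rho - 1)]^{j_\rho}}{j_1!\, j_2! \dots j_\rho!\, [(2h_1)!]^{j_1} [(2h_2)!]^{j_2} \dots [(2h_\rho)!]^{j_\rho}} \\
&= 1 \cdot 3 \cdot 5 \dots (2r-1)\, \mu_2^r \sum_{i=1}^{h} r^{[-(h+i)]} \sum \frac{1}{j_1!\, j_2! \dots j_\rho!\, [h_1!]^{j_1} [h_2!]^{j_2} \dots [h_\rho!]^{j_\rho}}
\end{aligned}
\]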


ili3 he pd unas 


and 
eae eee el Ee 
Mar, (N)= 1 . 3 MO! cies (27 — Dias 2 NEA > (- 1) ay y—i+h rocentn 
=a 
t= v=0 
=1 3 = Dy 1 r 1 q + r 
tal era ee een Ye ll) We 1-8-5 rier Lies hin. 


Thus in the case when the law of distribution of the values of $X$ for $i < r$
gives values of $\mu_i$ answering the Gauss-Laplace law, then the values $X_{(N)}$ follow
a law of distribution giving for $\mu_{i,(N)}$ for $i < r$ the same values as the law of Gauss.
Consequently if the law of distribution of values of $X$ is Gaussian, then the law of
distribution of values of $X_{(N)}$ is Gaussian also for all values of $N$.

CHAPTER II


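I

(1) Putting

\[ \mu'_r = \frac{1}{N} \sum_{i=1}^{N} (X_i - m_1)^r \qquad (1), \]

we have: $E \mu'_r = \mu_r$.

Noting that $\mu'_r$ is the arithmetic mean of the mutually independent quantities

\[ (X_1 - m_1)^r,\ (X_2 - m_1)^r,\ \dots,\ (X_N - m_1)^r, \]

and that $E (X_i - m_1)^r = \mu_r$, while $E [(X_i - m_1)^r]^h = \mu_{hr}$,
we find from formulae (4) and (5) of Chapter I, when we put $r = m$, and replace
$m_{r,(N)}$ by $E {\mu'_r}^m$ and $m_h$ by $\mu_{hr}$:

\[
\begin{aligned}
E {\mu'_r}^m &= \frac{1}{N^m} \sum_{h=1}^{m} R_{[m,h]}\, N^{[-h]} = \frac{1}{N^m} \sum_{h=0}^{m-1} R_{[m,m-h]}\, N^{[-(m-h)]} \\
&= \sum_{i=0}^{m-1} \frac{1}{N^i} \sum_{h=0}^{i} (-1)^h \beta_{m-i+h,h}\, R_{[m,m-i+h]},
\end{aligned}
\qquad (2)
\]

where

\[ R_{[m,m-h]} = \sum_{i=1}^{h} m^{[-(h+i)]}\, \mu_r^{m-h-i} \sum \frac{\mu_{h_1 r}^{j_1} \mu_{h_2 r}^{j_2} \dots \mu_{h_\rho r}^{j_\rho}}{j_1!\, j_2! \dots j_\rho!\, [h_1!]^{j_1} [h_2!]^{j_2} \dots [h_\rho!]^{j_\rho}} \qquad (3), \]

where the last summation extends to all positive integer values of $j_1, j_2, \dots, j_\rho$
satisfying the condition: $j_1 + j_2 + \dots + j_\rho = i$, and to all integer values of $h_1, h_2, \dots, h_\rho$,
satisfying the conditions:

\[ 2 \le h_1 < h_2 < \dots < h_\rho, \qquad h_1 j_1 + h_2 j_2 + \dots + h_\rho j_\rho = h + i. \]

Hence:

\[
\begin{aligned}
R_{[m,m]} &= \mu_r^m, \\
R_{[m,m-1]} &= C_m^2\, \mu_r^{m-2} \mu_{2r}, \\
R_{[m,m-2]} &= 3\, C_m^4\, \mu_r^{m-4} \mu_{2r}^2 + C_m^3\, \mu_r^{m-3} \mu_{3r}, \\
R_{[m,m-3]} &= 15\, C_m^6\, \mu_r^{m-6} \mu_{2r}^3 + 10\, C_m^5\, \mu_r^{m-5} \mu_{2r} \mu_{3r} + C_m^4\, \mu_r^{m-4} \mu_{4r}, \\
R_{[m,m-4]} &= 105\, C_m^8\, \mu_r^{m-8} \mu_{2r}^4 + 105\, C_m^7\, \mu_r^{m-7} \mu_{2r}^2 \mu_{3r} + 15\, C_m^6\, \mu_r^{m-6} \mu_{2r} \mu_{4r} \\
&\quad + 10\, C_m^6\, \mu_r^{m-6} \mu_{3r}^2 + C_m^5\, \mu_r^{m-5} \mu_{5r},
\end{aligned}
\qquad (4)
\]

and, on the other hand

\[
\begin{aligned}
R_{[m,1]} &= \mu_{mr}, \\
R_{[m,2]} &= \frac{1}{2} \sum_{h=1}^{m-1} C_m^h\, \mu_{hr}\, \mu_{(m-h)r}, \\
R_{[m,i]} &= \sum_{h=i-1}^{m-1} C_{m-1}^h\, \mu_{(m-h)r}\, R_{[h,i-1]}.
\end{aligned}
\qquad (5)
\]

Substituting in (2), we find:

\[
\begin{aligned}
E {\mu'_r}^m &= \mu_r^m + \frac{1}{N}\, \frac{m^{[-2]}}{2}\, \mu_r^{m-2} [\mu_{2r} - \mu_r^2] \\
&\quad + \frac{1}{N^2} \Big\{ \frac{m^{[-4]}}{8}\, \mu_r^{m-4} [\mu_{2r} - \mu_r^2]^2 + \frac{m^{[-3]}}{6}\, \mu_r^{m-3} [\mu_{3r} - 3 \mu_{2r} \mu_r + 2 \mu_r^3] \Big\} + \dots
\end{aligned}
\qquad (6)
\]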

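(2) $E {\mu'_r}^m$ may also be calculated by means of recurrence formulae. Put

\[ \nu^{(m)}_{r,(N)} = E N^m {\mu'_r}^m = E \Big[ \sum_{i=1}^{N} (X_i - m_1)^r \Big]^m \qquad (7). \]

From relation (8) of the first chapter we find:

\[ \nu^{(m)}_{r,(N)} = N \Big\{ \mu_{mr} + \sum_{h=1}^{m-1} C_{m-1}^h\, \mu_{(m-h)r}\, \nu^{(h)}_{r,(N-1)} \Big\} \qquad (8). \]

Noting that $\nu^{(1)}_{r,(N)} = N \mu_r$, we find hence:

\[
\begin{aligned}
\nu^{(2)}_{r,(N)} &= N \mu_{2r} + N^{[-2]} \mu_r^2 = N [\mu_{2r} - \mu_r^2] + N^2 \mu_r^2, \\
\nu^{(3)}_{r,(N)} &= N \mu_{3r} + N^{[-2]}\, 3 \mu_{2r} \mu_r + N^{[-3]} \mu_r^3, \\
\nu^{(4)}_{r,(N)} &= N \mu_{4r} + N^{[-2]} [4 \mu_{3r} \mu_r + 3 \mu_{2r}^2] + N^{[-3]}\, 6 \mu_{2r} \mu_r^2 + N^{[-4]} \mu_r^4,
\end{aligned}
\qquad (9)
\]

and, in general,

\[ \nu^{(m)}_{r,(N)} = \sum_{i=1}^{m} N^{[-i]}\, D^{(m)}_i \qquad (10), \]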
where 
(m) \ 
Dy 18 Son Linr | 
(m)  _se am | 
mo Mr | 
Sl 
(m) mM h (h) 
Dy tan > CG. kinane 2, Asal | (11) 
4 h=t-1 ; 
(m) (m—2) (2 - =) aes m—s (m—1) | 
= _ a m—1 
Bate (m 1) for D iB ees by Dn =(( ) Pr br ae Ds | 
i 
= Cn? Por by | 
2-3 F 

10¢=2 bed : v (ole aL 3c" 2 LL fee | 
ee ae Mer on vate b! | pms v ny Mor } 


and so on. 
Substituting in (10), we obtain 
y ™ = NE u ym = NI-™] poy + NE mv) (Ghee pear Teg 


DN) 


m—-s 


+ Nim jie 3 = Ge 1 _j Br! I thim—s 3) y+ 30,3 Pay pay ; Po ri (Ihe) 


. 
— 


m—-1 
: yh 
ar NES D2, Cae: Mim—nh) rv Par aP IMT 
n= 


Au. A. TcHOUPROFF 161 
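(3) When $r = 2$, in the case when the law of distribution is Gaussian, we find
from (8):

\[ \nu^{(m)}_{2,(N)} = N \Big\{ 1 \cdot 3 \cdot 5 \dots (2m-1)\, \mu_2^m + \sum_{h=1}^{m-1} C_{m-1}^h\, 1 \cdot 3 \cdot 5 \dots (2m-2h-1)\, \mu_2^{m-h}\, \nu^{(h)}_{2,(N-1)} \Big\}. \]

Hence noting that $\nu^{(1)}_{2,(N-1)} = (N-1) \mu_2$, we find:

\[
\begin{aligned}
\nu^{(2)}_{2,(N)} &= N (N+2)\, \mu_2^2, \\
\nu^{(3)}_{2,(N)} &= N (N+2) (N+4)\, \mu_2^3, \\
\nu^{(4)}_{2,(N)} &= N (N+2) (N+4) (N+6)\, \mu_2^4.
\end{aligned}
\]

Let us assume that for all values $i \le m$,

\[ \nu^{(i)}_{2,(N)} = N (N+2) (N+4) \dots (N+2i-2)\, \mu_2^i. \]

In this case:

\[
\begin{aligned}
\nu^{(m+1)}_{2,(N)} &= N \mu_2^{m+1} \Big\{ 1 \cdot 3 \cdot 5 \dots (2m+1) \\
&\quad + \sum_{h=1}^{m} C_m^h\, 1 \cdot 3 \cdot 5 \dots (2m-2h+1)\, (N-1)(N+1) \dots (N+2h-3) \Big\}, \\
\nu^{(m)}_{2,(N)} &= N \mu_2^{m} \Big\{ 1 \cdot 3 \cdot 5 \dots (2m-1) \\
&\quad + \sum_{h=1}^{m-1} C_{m-1}^h\, 1 \cdot 3 \cdot 5 \dots (2m-2h-1)\, (N-1)(N+1) \dots (N+2h-3) \Big\},
\end{aligned}
\]

and, consequently,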


(N+ 2m) p00", = Nuge 3.5...(2m —1)(2m+ 1) +(N—-1)1.3.5...(2m—1) 
m—1 

15m Oly 1.8.5... (2m — 2h — 1) (2m — 2h + 1)(N-1)(V 41)... (NV + 2h —38) 
h=1 m— 
m— = 


= 1.3.5... (2m —2h—1)(N —-1)(N +1)... (N+ 2h- vy} 


m—1 


Bis. Sonal) 


+2 O,71.3.5...(2m—2h41)(N—1)(N41)...(N + 2h —3)| 
ho 


(n+) 
2,(N)" 


Thus when the law of distribution of values of $X$ is Gaussian,

\[ \nu^{(m)}_{2,(N)} = N (N+2) (N+4) \dots (N+2m-2)\, \mu_2^m, \quad \text{for } m = 0, 1, 2, 3, \dots, \infty \qquad (13). \]
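The induction leading to (13) can be spot-checked by running the recurrence (8) with Gaussian moments. The sketch below rests on assumptions made here for the illustration: $\mu_2$ is taken equal to 1, so that $\mu_{2k} = 1 \cdot 3 \cdot 5 \dots (2k-1)$, and the names `odd_product`, `nu2` and `product_form` are hypothetical.

```python
from functools import lru_cache
from math import comb


def odd_product(k):
    """1 * 3 * 5 * ... * (2k - 1); equals mu_{2k} for a Gaussian law with mu_2 = 1."""
    out = 1
    for t in range(1, 2 * k, 2):
        out *= t
    return out


@lru_cache(maxsize=None)
def nu2(m, N):
    """E[ sum_{i<=N} (X_i - m_1)**2 ]**m for Gaussian X with mu_2 = 1,
    computed from recurrence (8) of this chapter with r = 2."""
    if m == 0:
        return 1
    if N == 0:
        return 0          # an empty sum raised to a positive power
    return N * (odd_product(m)
                + sum(comb(m - 1, h) * odd_product(m - h) * nu2(h, N - 1)
                      for h in range(1, m)))


def product_form(m, N):
    """N (N + 2) (N + 4) ... (N + 2m - 2), the right-hand side of (13) with mu_2 = 1."""
    out = 1
    for t in range(m):
        out *= N + 2 * t
    return out


agreement = all(nu2(m, N) == product_form(m, N)
                for m in range(0, 7) for N in range(1, 9))
```

Integer arithmetic suffices here because all Gaussian moments of even order are integers when $\mu_2 = 1$; for a general $\mu_2$ both sides are simply multiplied by $\mu_2^m$.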
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(4) 


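When $r = 3$, in the case of a Gaussian distribution of the values of $X$,

\[ \nu^{(m)}_{3,(N)} = N \Big\{ \mu_{3m} + \sum_{h=1}^{m-1} C_{m-1}^h\, \mu_{3(m-h)}\, \nu^{(h)}_{3,(N-1)} \Big\}. \]

Noting that $\nu^{(1)}_{3,(N)} = 0$, if the distribution of the values of $X$ be Gaussian,
we find:

\[
\begin{aligned}
\nu^{(2)}_{3,(N)} &= N \mu_2^3\, 1 \cdot 3 \cdot 5, \\
\nu^{(3)}_{3,(N)} &= 0 \ \text{ and, in general, } \ \nu^{(2m+1)}_{3,(N)} = 0, \\
\nu^{(4)}_{3,(N)} &= N \mu_2^6\, 1 \cdot 3 \cdot 5 \cdot 9\, [5N + 72], \\
\nu^{(6)}_{3,(N)} &= N \mu_2^9\, 3 \cdot 5 \cdot 9 \cdot 15\, [25 N^2 + 1080 N + 15912].
\end{aligned}
\]

When $r = 4$, for the case of a Gaussian distribution of the values of $X$, we
have:

\[
\begin{aligned}
\nu^{(m)}_{4,(N)} &= N \Big\{ 1 \cdot 3 \cdot 5 \dots (4m-1)\, \mu_2^{2m} + \sum_{h=1}^{m-1} C_{m-1}^h\, 1 \cdot 3 \cdot 5 \dots (4m-4h-1)\, \mu_2^{2(m-h)}\, \nu^{(h)}_{4,(N-1)} \Big\}, \\
\nu^{(1)}_{4,(N-1)} &= (N-1)\, \mu_4 = 3 (N-1)\, \mu_2^2, \\
\nu^{(2)}_{4,(N)} &= N \mu_2^4\, 3\, [3N + 32], \\
\nu^{(3)}_{4,(N)} &= N \mu_2^6\, 27\, [N^2 + 32 N + 352].
\end{aligned}
\]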

iG 


Replacing r by m and py) by H(w,— p,)™ in formulae (22) and (28) of 


h 
Chapter IJ, and replacing w;, by } (— 1)* Oh" w* wu, and putting 
) 


r= 
h 
= ia Ly ChE joe KMih-k) r = v.G%, 
k=0 
we find: 
ver ; one ole as : \ 
Ep = = 1.8 De ante XG 


Pia 2 


i a E Bape eS 5 
at Nm (Sank Xa 2 Xt nl XX — pale Xe | 


1 Spey cies Se : : , 
+ Vine A (ee Gime, Gee COPA GH Ce ¢ seal isl ibe ACP 


. eee ar ante tL sip eee ae Poa wl 
+ Ami Xm XX, — aa Xi" X,+qheml-1 Xm" X,! 


ae =) G a \ 
= vee zy ml-3| Xia aes ae a ; : mi) xs] aF oe | 


24 
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1 DY a Y fe 
E (ey — py PPO = 1.3.5... (2m $1) aa lm Xn X, 
=e ee EF (Se. Gi Gas thom Gey OD. 
+ gy ml- aXe Xe — mi 8 Xe | 


1 Fe 
+ GEE gag XX, + gto m4 A XX, 
Nuss 
+ hqml-4 Xm *§ XX, > (15). 


ai. ; Fi 
Wl XP gt em Xo XX 


“ai 
(m—1)(m — 2) 


ap rae m6) MME xX; .o 36 SSeS A) Gee AG AG, 


fe gig Xs m—5 XN. oxi 


a aagp me! {-7] AG m—7 Xs 


(m—1)(m—2)(m—4) [-3] Yim: m(3m—-1) yma y 
. 162 le eee ecerny =F tof 


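Hence:

\[
\begin{aligned}
E (\mu'_r - \mu_r)^2 &= \frac{1}{N} [\mu_{2r} - \mu_r^2], \\
E (\mu'_r - \mu_r)^3 &= \frac{1}{N^2} [\mu_{3r} - 3 \mu_{2r} \mu_r + 2 \mu_r^3], \\
E (\mu'_r - \mu_r)^4 &= \frac{3}{N^2} [\mu_{2r} - \mu_r^2]^2 + \frac{1}{N^3} [\mu_{4r} - 4 \mu_{3r} \mu_r - 3 \mu_{2r}^2 + 12 \mu_{2r} \mu_r^2 - 6 \mu_r^4].
\end{aligned}
\qquad (16)
\]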


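III

(1) Noting that

\[
\begin{aligned}
E \mu'_{r_1} \mu'_{r_2} &= \frac{1}{N^2}\, E \sum_{i=1}^{N} (X_i - m_1)^{r_1} \sum_{j=1}^{N} (X_j - m_1)^{r_2} \\
&= \frac{1}{N^2} \{ N \mu_{r_1+r_2} + N^{[-2]} \mu_{r_1} \mu_{r_2} \} = \mu_{r_1} \mu_{r_2} + \frac{1}{N} [\mu_{r_1+r_2} - \mu_{r_1} \mu_{r_2}],
\end{aligned}
\]

we find:

\[ E (\mu'_{r_1} - \mu_{r_1}) (\mu'_{r_2} - \mu_{r_2}) = E \mu'_{r_1} \mu'_{r_2} - \mu_{r_1} \mu_{r_2} = \frac{1}{N} [\mu_{r_1+r_2} - \mu_{r_1} \mu_{r_2}]. \]

Similarly we find:

\[
\begin{aligned}
E \mu'_{r_1} \mu'_{r_2} \mu'_{r_3} &= \frac{1}{N^3} \{ N \mu_{r_1+r_2+r_3} + N^{[-2]} [\mu_{r_1+r_2} \mu_{r_3} + \mu_{r_1+r_3} \mu_{r_2} + \mu_{r_2+r_3} \mu_{r_1}] + N^{[-3]} \mu_{r_1} \mu_{r_2} \mu_{r_3} \} \\
&= \mu_{r_1} \mu_{r_2} \mu_{r_3} + \frac{1}{N} [\mu_{r_1+r_2} \mu_{r_3} + \mu_{r_1+r_3} \mu_{r_2} + \mu_{r_2+r_3} \mu_{r_1} - 3 \mu_{r_1} \mu_{r_2} \mu_{r_3}] \\
&\quad + \frac{1}{N^2} [\mu_{r_1+r_2+r_3} - (\mu_{r_1+r_2} \mu_{r_3} + \mu_{r_1+r_3} \mu_{r_2} + \mu_{r_2+r_3} \mu_{r_1}) + 2 \mu_{r_1} \mu_{r_2} \mu_{r_3}],
\end{aligned}
\]

and

\[ E (\mu'_{r_1} - \mu_{r_1}) (\mu'_{r_2} - \mu_{r_2}) (\mu'_{r_3} - \mu_{r_3}) = \frac{1}{N^2} [\mu_{r_1+r_2+r_3} - \mu_{r_1+r_2} \mu_{r_3} - \mu_{r_1+r_3} \mu_{r_2} - \mu_{r_2+r_3} \mu_{r_1} + 2 \mu_{r_1} \mu_{r_2} \mu_{r_3}]. \]

Further,

\[
\begin{aligned}
E \mu'_{r_1} \mu'_{r_2} \mu'_{r_3} \mu'_{r_4} &= \frac{1}{N^4} \{ N \mu_{r_1+r_2+r_3+r_4} \\
&\quad + N^{[-2]} [\mu_{r_1+r_2+r_3} \mu_{r_4} + \mu_{r_1+r_2+r_4} \mu_{r_3} + \mu_{r_1+r_3+r_4} \mu_{r_2} + \mu_{r_2+r_3+r_4} \mu_{r_1}] \\
&\quad + N^{[-2]} [\mu_{r_1+r_2} \mu_{r_3+r_4} + \mu_{r_1+r_3} \mu_{r_2+r_4} + \mu_{r_1+r_4} \mu_{r_2+r_3}] \\
&\quad + N^{[-3]} [\mu_{r_1+r_2} \mu_{r_3} \mu_{r_4} + \mu_{r_1+r_3} \mu_{r_2} \mu_{r_4} + \mu_{r_1+r_4} \mu_{r_2} \mu_{r_3} \\
&\qquad\quad + \mu_{r_2+r_3} \mu_{r_1} \mu_{r_4} + \mu_{r_2+r_4} \mu_{r_1} \mu_{r_3} + \mu_{r_3+r_4} \mu_{r_1} \mu_{r_2}] \\
&\quad + N^{[-4]} \mu_{r_1} \mu_{r_2} \mu_{r_3} \mu_{r_4} \} \\
&= \mu_{r_1} \mu_{r_2} \mu_{r_3} \mu_{r_4} + \frac{1}{N} [\mu_{r_1+r_2} \mu_{r_3} \mu_{r_4} + \mu_{r_1+r_3} \mu_{r_2} \mu_{r_4} + \mu_{r_1+r_4} \mu_{r_2} \mu_{r_3} \\
&\qquad\quad + \mu_{r_2+r_3} \mu_{r_1} \mu_{r_4} + \mu_{r_2+r_4} \mu_{r_1} \mu_{r_3} + \mu_{r_3+r_4} \mu_{r_1} \mu_{r_2} - 6 \mu_{r_1} \mu_{r_2} \mu_{r_3} \mu_{r_4}] \\
&\quad + \frac{1}{N^2} [\mu_{r_1+r_2+r_3} \mu_{r_4} + \mu_{r_1+r_2+r_4} \mu_{r_3} + \mu_{r_1+r_3+r_4} \mu_{r_2} + \mu_{r_2+r_3+r_4} \mu_{r_1} \\
&\qquad\quad + \mu_{r_1+r_2} \mu_{r_3+r_4} + \mu_{r_1+r_3} \mu_{r_2+r_4} + \mu_{r_1+r_4} \mu_{r_2+r_3} \\
&\qquad\quad - 3 (\mu_{r_1+r_2} \mu_{r_3} \mu_{r_4} + \mu_{r_1+r_3} \mu_{r_2} \mu_{r_4} + \mu_{r_1+r_4} \mu_{r_2} \mu_{r_3} \\
&\qquad\qquad + \mu_{r_2+r_3} \mu_{r_1} \mu_{r_4} + \mu_{r_2+r_4} \mu_{r_1} \mu_{r_3} + \mu_{r_3+r_4} \mu_{r_1} \mu_{r_2}) + 11 \mu_{r_1} \mu_{r_2} \mu_{r_3} \mu_{r_4}] \\
&\quad + \frac{1}{N^3} [\mu_{r_1+r_2+r_3+r_4} - (\mu_{r_1+r_2+r_3} \mu_{r_4} + \mu_{r_1+r_2+r_4} \mu_{r_3} + \mu_{r_1+r_3+r_4} \mu_{r_2} + \mu_{r_2+r_3+r_4} \mu_{r_1} \\
&\qquad\quad + \mu_{r_1+r_2} \mu_{r_3+r_4} + \mu_{r_1+r_3} \mu_{r_2+r_4} + \mu_{r_1+r_4} \mu_{r_2+r_3}) \\
&\qquad\quad + 2 (\mu_{r_1+r_2} \mu_{r_3} \mu_{r_4} + \mu_{r_1+r_3} \mu_{r_2} \mu_{r_4} + \mu_{r_1+r_4} \mu_{r_2} \mu_{r_3} \\
&\qquad\qquad + \mu_{r_2+r_3} \mu_{r_1} \mu_{r_4} + \mu_{r_2+r_4} \mu_{r_1} \mu_{r_3} + \mu_{r_3+r_4} \mu_{r_1} \mu_{r_2}) - 6 \mu_{r_1} \mu_{r_2} \mu_{r_3} \mu_{r_4}],
\end{aligned}
\]

and

\[
\begin{aligned}
E (\mu'_{r_1} &- \mu_{r_1}) (\mu'_{r_2} - \mu_{r_2}) (\mu'_{r_3} - \mu_{r_3}) (\mu'_{r_4} - \mu_{r_4}) \\
&= \frac{1}{N^2} \{ (\mu_{r_1+r_2} \mu_{r_3+r_4} + \mu_{r_1+r_3} \mu_{r_2+r_4} + \mu_{r_1+r_4} \mu_{r_2+r_3}) \\
&\qquad - (\mu_{r_1+r_2} \mu_{r_3} \mu_{r_4} + \mu_{r_1+r_3} \mu_{r_2} \mu_{r_4} + \mu_{r_1+r_4} \mu_{r_2} \mu_{r_3} \\
&\qquad\quad + \mu_{r_2+r_3} \mu_{r_1} \mu_{r_4} + \mu_{r_2+r_4} \mu_{r_1} \mu_{r_3} + \mu_{r_3+r_4} \mu_{r_1} \mu_{r_2}) + 3 \mu_{r_1} \mu_{r_2} \mu_{r_3} \mu_{r_4} \} \\
&\quad + \frac{1}{N^3} \{ \mu_{r_1+r_2+r_3+r_4} - (\mu_{r_1+r_2+r_3} \mu_{r_4} + \mu_{r_1+r_2+r_4} \mu_{r_3} + \mu_{r_1+r_3+r_4} \mu_{r_2} + \mu_{r_2+r_3+r_4} \mu_{r_1}) \\
&\qquad - (\mu_{r_1+r_2} \mu_{r_3+r_4} + \mu_{r_1+r_3} \mu_{r_2+r_4} + \mu_{r_1+r_4} \mu_{r_2+r_3}) \\
&\qquad + 2 (\mu_{r_1+r_2} \mu_{r_3} \mu_{r_4} + \mu_{r_1+r_3} \mu_{r_2} \mu_{r_4} + \mu_{r_1+r_4} \mu_{r_2} \mu_{r_3} \\
&\qquad\quad + \mu_{r_2+r_3} \mu_{r_1} \mu_{r_4} + \mu_{r_2+r_4} \mu_{r_1} \mu_{r_3} + \mu_{r_3+r_4} \mu_{r_1} \mu_{r_2}) - 6 \mu_{r_1} \mu_{r_2} \mu_{r_3} \mu_{r_4} \}.
\end{aligned}
\]

The term in $1/N^2$ in

\[ E (\mu'_{r_1} - \mu_{r_1}) (\mu'_{r_2} - \mu_{r_2}) (\mu'_{r_3} - \mu_{r_3}) (\mu'_{r_4} - \mu_{r_4}) \]

can be expressed in the form

\[
\begin{aligned}
\frac{1}{N^2} \{ &[\mu_{r_1+r_2} - \mu_{r_1} \mu_{r_2}] [\mu_{r_3+r_4} - \mu_{r_3} \mu_{r_4}] + [\mu_{r_1+r_3} - \mu_{r_1} \mu_{r_3}] [\mu_{r_2+r_4} - \mu_{r_2} \mu_{r_4}] \\
&+ [\mu_{r_1+r_4} - \mu_{r_1} \mu_{r_4}] [\mu_{r_2+r_3} - \mu_{r_2} \mu_{r_3}] \} \\
= &[E (\mu'_{r_1} - \mu_{r_1}) (\mu'_{r_2} - \mu_{r_2})] \cdot [E (\mu'_{r_3} - \mu_{r_3}) (\mu'_{r_4} - \mu_{r_4})] \\
&+ [E (\mu'_{r_1} - \mu_{r_1}) (\mu'_{r_3} - \mu_{r_3})] \cdot [E (\mu'_{r_2} - \mu_{r_2}) (\mu'_{r_4} - \mu_{r_4})] \\
&+ [E (\mu'_{r_1} - \mu_{r_1}) (\mu'_{r_4} - \mu_{r_4})] \cdot [E (\mu'_{r_2} - \mu_{r_2}) (\mu'_{r_3} - \mu_{r_3})]^{*}.
\end{aligned}
\]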

(2) In the general case, putting 


and agreeing to denote 


t—-h+1 ¢-h+2 i-h+3 G 
> > >) Aes > bv. Diss 
f,=1 Srx=f\t1 Ss=f.+1 Sh=Sfn- +1 Wie (A) 
we find: 
y) Is heals ey () 
EM ine el NEN | a 
N aa “7 Nt j=0 1, 4-9 i,t 
inl J 
e. : () 
qr OS, agen Gall) ota a > nian 
g=1 NI p= eed a BT eh 
(2) = 
Here, ee = TT, 
and ie may, when 27 <7, be written in the form 
Pape es eee, 
so =F. G4D) Brg Mery oes Prp 
and, when 2) > 7, in the form 
ixp BG ee 
i Sou LK Ww 


r(?) 
where Kk ji Prptrpt trp 


and fie denotes the sum of all products of / factors of type pa, Ming «+ May 
possible, subject to the condition that /,, h., ... h; appear in their correct order, in 
sums of not less than two terms taken from the numbers 7y,, 7y,,... 77,,,, and that 
the number of summands composing /;, is not greater than the number composing 
hys,. We have in this manner: 


. V1 7 P 
qH® Ss sy ; Ti 
ioe me eae nae ea yy 
i= ao — aa flare Myre 
A 7-2 =i-1 i - 
He = v Ss He 
5, G0 Rea ca ere are ha hut ge 
Ai=1 fo=f,tl1 S3=fo+1 < [iy feed edsf 


eee SD [Me sire berg tire bh Marg trp, Mery pr 
cine ee a POSEN: ee UAE Ske TARE 


+ lod ] a 
ne ty "re +r p,- ke , 
0 eat eae Se erp, Borg, Borg, Mop, 


* Cf. Soper, “‘On the Probable Error of the Correlation Coefficient to a Second Approximation,” 
p. 97 (Biometrika, Vol. rx). 
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@ IT 
fe j zs = Pope me gk LL trp +77 +47 
OS.) big, Her, Hong Bing, 8 Is 
Ny 
2 z [Mr trp, Bry trp te a+ ey trp, Mr ptr g trp 1 yee 
FON re a nn EERIE (6 Bi FY 
ae Sa ly ; =| H : 
My p try, Hep try, bry +rp, Py pt pe Tape Mrp trp, Saal 


SI, (6) Bre, eae Mrz. a (ips bre, 
and, on the other hand 


@) 
Js 1 Brytret.. try = bre 


(21) 2 wed 2% 
21) aac = 
Fy y= prs Prong berg try, Mrmtny tng + oe 
; j=l A=l fo=ji+l o a 
Qi-k+1 2i-—k4+2 2i 
a 2 x meee > Mog tng tetry, Mralrj try be try) be 
- ee y . 1 ae ho Jk Jape? Jk 
A=1 p=fitl Ik=JIk—\ +1 
i+1 7+2 22 
Ss Ss S: 
an Ee aes a SMG ATG HANG; Br—lr5.t9j, +4754) 
A=1 jfo=fjtl Wi=hi-1 +t 
. +1 Qi B41 
(2¢+1) s S Ss 
Ay y= > Pep brarg i 2s SH ce seri 
j=l A=) h=it “s 
i+2 i+8 21 
ee = aut ee Mrgciig eet g eta lngeh Macher tha le 
A= do jl Ra=hi-1+1 : ae ; ‘i 
F A 28 (2) 
In the case in which 7,=7,=7,=...=7;, the coefficients fel ‘ become Ry, 


and we get again formulae (2), (4) and (5) of the present chapter. 
(3) Let us agree to denote 
Bi; — ie) (Hie, Se) ee are 
We have: 
Ey dp=EV')—-— = Hes, F(T oli fry Hey, 2 (Uh eng 


j. J, (2) 
(= 1) 5 = Marg erg. = Beary, bh (WL [ny a vee bn | +... 
+(- le ea : Pr, Marj, sre Brgy, K (Holes, ay, ode os 


+(-1) 7 ((-1) Ty. 


Using here the values found above for HTI’(, F(T yf ngs and so on, and denoting 


by Kr" "a the sum of all products of J factors of type yp, Oe o/h 
possible, subject to the conditions that /,, h., ... hy appear in order, that the sums 
contain not less than two summands chosen from the numbers 7), 7y, ... %4_,,,, and 
that the number of summands in /; is not less than in /i;,,, we find that the 
coefficient of 1/N* in the development of 4.) du equals: 
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2i-t-1 
(1) Me & C1) Co" Baa: +3  (- Loo part oa 
x 
—k (or %4-t+h) 
x ( "S > I] (20) 
Ys SE-k+ Bry, ge eileen 


Bers. = (— ye Boi-t4b~1,% 


t—k (or 24-1-t+ k) Tei [ty Benton: a 
x S Dy GN as a) 
1=1 St-k+) Mr pd eae 


i= 
+2 bry Par, = ey Pattee k 
jf, (2) “Po 2 h= 


> 


KS he) 


ee ee 


t—k (or 2i-2-t+h) TI (a) [My Lip. eee le 
x = 2, - REG a ise) 
T= (ES ICEEL ay | Ue rer 0 heee 
i ( ar ) Bry Bry, dh feoraie 


t-1 
+ (=) OS py bry ee My 2% (— 1) Beitr, 
Guia alee <2 wh e=0 
t—k (or 2i-h-¢ +k) Tl es 1 Heng Peng dy eee ' 
x > Ge ak a hy) eas 
t=1 f.(t-k+0) Mop uae eon 


a (= 1 eae D> aa cee ae Sy) (= 1 ye Pru, k 


Jj, (2i—¢-1) 2i-t— ey 


t—k (or k+1) i Ne Use ; Sere (rp. Vy. Ve 
x > > " pone 1K} Tigwdas ns Tren) 
t=1 ff, (t-kt+l) By, & 
An Op lelips k+l ) 
where the summation for l extends to all positive integer values of l from 1 to the
least of the upper limits.
By the aid of simple transformations the coefficient of 1/N* in the development 


of H., can be expressed in the form: 
2i-t-1 \ 


Y 
B(-1)' ley 2 (—1)* Cy Boia, t 
n= 
ts1 t-k : 
ee (1) Me KAT hay) 
= Ent. (¢/2) fa JAt= ht) Popp ite ee iyi 
2i-t-] 
h Own 
a EO ee 1 Patek nk 
h=0 
ant. (¢/2)—1 k+1 Tl : | 
ay = (2 yr (re . aa 
a Se (— Ft o = : - KA Ma pay) te. (18) 
k=0 l=1 f, (t- etl) Bry, Pig Era | co ; 
2i-t-1 
0 
ee al Cee oeue eg | 
=0 
Ent. (¢/2)—1 t-2h-2 in | 
i ae — 2i) r (pp a a5 
aP mS, (= 2 Ds > 2 kK Wh Migr + "oy_oy 3) 
k=0 j=0 f, (2t-2k-7) fry iy eee i eye | 
Qi- D+ 2h+j 
x S (-—1) Gin. ; Boi H 
os 2i—at+ek+j Poei-t+k—h,k | 
L=0 y) 
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Noticing that (cf. Introduction, IT, 3) 


r—k-1 
Ps (—1)* C,," 6,2, =0 when m> 2k, 
n=0 ; 
r—k-1 
(HU Cut Braye = 1.8.5... 2k— I), 
1=0 
= all ; h nN er 
d . io ibs 
> (—1)'C,,_, Brae = > Bing Cates 
n=0 g=0 


where B,,, has the value given in the Introduction (12), (13) and (14), we see that 
the coefficient of 1/N‘ in the development of £; du is zero, if t<7. The coefficient 
of 1/N* equals 
i-1 
(<1)? gg 15345 Go — le ea) 
k=1 
IT ei 
2k) laa Bry, One By, 


2i-2k 


x 
S; (2 


7M 


i-k AG 
Me? + MO, eecaloy 


where by Wes is denoted the sum of all the different products of type 


My Hhy «+» Hy, possible, subject to the condition that 4,, ho, ...h;~ appear as 
sums of pairs of the numbers ry, f,, «++ Myeg_ on: 
. ik : 
When 7,=7",=...=7, the aggregate of terms denoted by M « = ) x reduces to 


(2i a a} a 1) (22 = Ik — 3) se) A Bic Tsar 


and 
sige OPS ys 3S Wey gt 
S, (2i-2h) Bry, Mrp, BOS LARP hs a ia 
=1.3.5...(2k—1) 08 7™ wp Qi — 2h — 1)(21 = 2k — 3)... 5.3. ee 
= 103 (Den (2) 1) Orme aee a 
The coefficient of 1/N* consequently becomes (cf. (14) above):
$2355... (2U— D(a) 
Similarly, we find that the coefficient of 1/N‘ in the development of Hei, du
reduces to 


2i-¢ ) 
t : hoo 
(= 1) IT (27-41) ree (= Ie C ‘st Bois, t 
1=0 é 
Gea t-k lle: 
e S (21-41) r (rp tpt 
1s eke V(r i Katee) 
k= Ent. (¢/2) l=1 f, (¢-k4+1) Bry Pip Aap Aa 
2i-¢ S ; 
S h (th 
x he (SC area ee kent | 
n=0 
Ent. (¢/2) -1 k iD bas 
4 by A 2 : 4 , ! 
fo GS. G1) ae iinet KM the Ha See: 
k=0 ; l=1 f,(¢-kt+l) Mr, ty, eee als Pe 
2 
hc 
OR LC een cermer mae | 
n=0 
Ent. (¢/2)-1 t-2k-1 lee 
A ~ = (2¢41 b . “ 
oie ead atlyta > 2 —— ED 9 Mg Toy op 9) 
k=0 j=0 F,(2t-2k—/j) Bry Pry, ree LAE atch 
9-24 2k+G41 ; 
e ay 
x > (= iy C “of —ot--ok-+j-1 Bor tyk—-Ha, k 


A=0' y) 


Au. A. TCHOUPROFF 169 


When ¢ <i the coefficient of 1/NV* in the development of 4'y;,,) dw consequently 
reduces to zero. When ¢t=7 +1, noting that 


i-1 
I 

Be Cs Cite = 

h=0 I 


] 


2k 

= h e = 

> (- ye Oe Pik = NH 8h4 5) she (2k — ils); 
=0 


i-1 1 = : 
2 (uC) Ge p= SB, Or = 185k 1G ka 1) eh =1)) 


2hk—-1 
g=0 


2-1 3 
= Ble WCC Bienes 
= 


we find: 


(- De Sie we (27+ 1) 24 HT era 


ies cri se ye 
Lo ly. ese tysety Sg | 
fe (27. t—-k+ c ZA 
+[G@-k+1)+3(h-l] & as eS Me | 


Sf, Qi 2k+2) Bers, Re: Aye ans 


(i) 
alr My 2953 


= 


(i—k+1) 


(ih) 
where M,,  , , has the value given above, and M, °°, , denotes an analogous 


9 
sum of terms of type pn, Mig ++ Hig_, With the single difference that in the aggre- 
gate h;_, there occur not two but three of the numbers 7y,, 77, «++ Tyo,» While 
the remaining 22 — 2/'— 2 numbers of this series are distributed in pairs among 


hee Tas cee Wipe eis 


In the case 7, =1r.=...=12i4,, the coefficient of 1/N* in the development of 
FE oixs) du becomes : 


(—1)'1.3.5...(2¢ +1) 36 a +(—1)'1.8.5...(20-1) [1+ 3 @—1)] Ch, ee oor 


2k i-h-1 3 7 


i-1 94 —Ie+ : 
+ S(—1 1.3.5...(2k—1) ee Oy f. Moy Mor C5, ont, 1-8. 5...(22— 2k — 3) 
k=1 


-2h+Q We i-k+1 


+[G@—h4+1)+3(b-D] Cy we By, 1.3.5...(28- 2+ Df 
ee rua C2 1 Sub. 420-8) 


- 0 v oy7— ss) ¢ 9° 
wile dance (20-1) 5 Wisp — pe |) [ian Spon ftp 22," |. 
(Cf. (15) above.) 


(To be continued.) 


MISCELLANEA. 


I. Preliminary Note on the Association of Steadiness and Rapidity 
of Hand with Artistic Capacity. 


By M. L. TILDESLEY, 


Crewdson-Benington Student, University College, London. 


(1) This preliminary note is based on observations made by Miss M. Dalgliesh at the request 
of Professor K. Pearson. As a teacher of drawing in a large school, she had a long experience of the 
amount of artistic imagination and artistic craft in her pupils, and was able to obtain appreciations 
of their other abilities. The categories she supplied for about 60 pupils were: (a) their ages*,
(b) the number of years during which they had learnt drawing†, (c) their artistic or non-artistic
capacity, the former being subdivided into imagination and craft, (d) their mathematical, and 
(e) their musical ability. The steadiness and rapidity of hand were to be tested by the well-known
“maze”-problem. Three mazes were prepared, I, II, III, of varying degree of tortuosity. The nature
of the problem was explained to the pupils. They were to enter the maze at A and leave at Q, a 
continuous pencil track being drawn from the point of entry to the point of exit. The performance 
was to be considered the more excellent the fewer the occasions on which the pencil track touched 
the boundaries of the maze path—such touching being termed a ‘‘bump.”’ The ideal pencil track 
would keep steadily in the mid-path and parallel to its borders. No distinction however in this 
preliminary experiment was made between a non-bumping wavy line and an ideal track. The 
efficiency due to keeping clear of the boundaries was simply determined by the number of bumps. 
The minimum number of bumps of any girl in any one maze was one and the maximum 72. 
Further, the performance was to be considered the more satisfactory the greater the celerity with 
which the track was completed. The minimum time (taken with a stop-watch) of any pupil in 
any one of the three mazes was 18 secs. and the maximum time practically 3 mins. Contrary to 
what might by some be anticipated, there was not a high negative correlation between the number
of bumps and the time taken. Although on the one hand an over-hasty temperament might lead 
to many bumps, on the other a certain celerity tends to straightness of path while hesitation leads 
to bumping. These points will be more easily grasped from the correlation results provided below. 


(2) A question arises as to the relative difficulty of the three mazes measured in time and
number of bumps. The average values are:


                                      Maze I           Maze II          Maze III
Average number of minutes taken       2·002 ± ·043     1·208 ± ·031     1·391 ± ·037
Average number of bumps made          20·68 ± 1·297    15·39 ± ·927     31·82 ± 1·392


These numbers, however, can hardly be taken as measuring the absolute difficulty of the three 
mazes, for (i) they are not of equal length, and (ii) they have not the same number of changes of 
direction. Approximately the following hold: 


                                      Maze I       Maze II      Maze III
Length of mid-path                    1025 mm.     700 mm.      730 mm.
Number of changes of direction        128          94           84

and thus we have:

                                                  Maze I     Maze II    Maze III
Number of cms. described per minute               51·20      57·95      52·48
Number of bumps per ten changes of direction      1·616      1·637      3·788


* Their mean age was 14·43 years with a standard deviation of 2·07 years, the actual range being
from 10 to 18 years.

† The mean number of years during which the pupil had learnt drawing was 3·84 with a standard
deviation of 2·20, the actual range being from 1 to 8 years.


Thus judged by time whether absolutely or relatively to its length Maze II was the easiest, 
Maze I the hardest. But against this must be set the fact that Maze I was taken first, and probably 
the pupils in this case proceeded with more caution. Absolutely and relatively to the number of 
changes of direction Maze III was the hardest. Maze I does not seem to have been harder than 
Maze II, if we judge its mean number of bumps relative to the number of changes of direction. 
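The relative figures quoted above follow directly from the tabulated means. As a minimal sketch, reading the averages as 2·002, 1·208 and 1·391 minutes and 20·68, 15·39 and 31·82 bumps, the two rates can be recomputed as follows (the dictionary layout and function names are illustrative only and are not part of the original note):

```python
# Values as read from the tables of this note; lengths in mm, times in minutes.
mazes = {
    "I":   {"length_mm": 1025, "turns": 128, "minutes": 2.002, "bumps": 20.68},
    "II":  {"length_mm": 700,  "turns": 94,  "minutes": 1.208, "bumps": 15.39},
    "III": {"length_mm": 730,  "turns": 84,  "minutes": 1.391, "bumps": 31.82},
}


def cm_per_minute(maze):
    """Centimetres of mid-path described per minute."""
    return (maze["length_mm"] / 10.0) / maze["minutes"]


def bumps_per_ten_turns(maze):
    """Mean number of bumps per ten changes of direction."""
    return maze["bumps"] / (maze["turns"] / 10.0)


rates = {
    name: (round(cm_per_minute(m), 2), round(bumps_per_ten_turns(m), 3))
    for name, m in mazes.items()
}

for name, (speed, bump_rate) in rates.items():
    print(f"Maze {name}: {speed} cm/min, {bump_rate} bumps per ten turns")
```

On these figures Maze II is traversed fastest and Maze III gives by far the most bumps per change of direction, which is the ordering described in the text.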


Our total numbers being few it seemed at first desirable in order to reduce the probable errors 
of our results to treat each trial as an independent event and thus reach a total of over 150 cases. 
This possibility is, however, excluded by the difference in difficulty of the mazes; pooling would 
have produced spurious correlation. We were thus compelled to work out correlations for each 
maze, or to pool the total achievement of each pupil. We have sometimes adopted one and some- 
times the other method. 
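The gain that pooling would have offered can be gauged from the classical probable error of a product-moment correlation, 0·6745 (1 − r²)/√N. The sketch below is an illustration only: the note does not state the exact number of pupils, and N = 56 is an assumption inferred from the printed errors (a correlation near zero is given a probable error of ·090).

```python
from math import sqrt


def probable_error_of_r(r, n):
    """Classical probable error of a product-moment correlation r from n cases."""
    return 0.6745 * (1.0 - r * r) / sqrt(n)


# Assumption: 56 pupils, inferred from the printed probable errors, not stated in the note.
N_PUPILS = 56
# Hypothetical pooled total if each of the three trials counted as a separate case.
N_POOLED = 3 * N_PUPILS

pe_single = probable_error_of_r(0.0, N_PUPILS)
pe_pooled = probable_error_of_r(0.0, N_POOLED)

print(f"per-maze probable error near r = 0: {pe_single:.3f}")
print(f"pooled probable error near r = 0:   {pe_pooled:.3f}")
```

Pooling would thus have cut the probable errors by a factor of √3, but only under the pretence that the three trials of one pupil are independent and that the mazes are of equal difficulty, which is why the note rejects it.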


To obtain a single general standard of efficiency in maze description we have taken as a com- 
bined measure the inverse product of the time taken and the number of bumps made. We shall 
speak of this as the “inverse product” simply. It receives some sort of justification, when we note 
that the factors are not highly correlated and further when we note that we desire a measure which 
shall increase with efficiency. 
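As a minimal sketch of the combined measure, the inverse product may be written as below. The function name and the guard against non-positive inputs are our own additions; the note itself records that no pupil made fewer than one bump in any maze, so the product never vanishes in this material.

```python
def inverse_product(minutes, bumps):
    """Combined efficiency: reciprocal of (time taken x number of bumps)."""
    if minutes <= 0 or bumps <= 0:
        raise ValueError("time and number of bumps must both be positive")
    return 1.0 / (minutes * bumps)


# Hypothetical pupils: the quicker and the steadier tracing both score higher.
slow_and_bumpy = inverse_product(2.0, 20)
quick_and_bumpy = inverse_product(1.0, 20)
quick_and_steady = inverse_product(1.0, 5)

print(slow_and_bumpy, quick_and_bumpy, quick_and_steady)
```

The measure rises when either factor falls, which is the property the text asks of it.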


The fundamental problem we had in view is the following: To what extent are steadiness and 
rapidity of hand as exhibited in maze-tracing the result of training? to what extent are they innate? 

Before proceeding to the discussion of this problem we may note the variability in period of 
time and in number of bumps for the three mazes. 


                          Maze I                        Maze II                       Maze III
                   S.D.          C. of V.        S.D.          C. of V.        S.D.          C. of V.
Time in maze ...   ·479 ± ·031   23·91 ± 1·61    ·340 ± ·022   28·15 ± 1·93    ·410 ± ·026   29·48 ± 2·04
No. of bumps ...   14·39 ± ·92   69·59 ± 6·22    10·29 ± ·66   66·85 ± 5·86    15·44 ± ·98   48·54 ± 3·75

These results seem to indicate a conformity with the general law that the harder the test the 
greater is the scatter*, i.e. the weak fail more conspicuously and the able succeed more markedly. 
This is a law manifested in most stiff competitive examinations, or again in the difficulty of making 
marked distinctions in the case of easy papers. We are speaking here of the scatter or variability 
as measured absolutely by the standard deviation. It is noteworthy that the relative variability 
(or the variability as percentage of the mean value) as measured by the coefficient of variation 
appears in the case of the bumps to be less in the case of the harder maze. We are thus driven to 
the conclusion that the emphasis of the difference between the ineffectual and effectual in a given 
task while increasing with the stiffness of the task does not increase proportionally to that stiffness, 
but probably at some lesser rate. 
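The contrast between absolute and relative variability can be checked from the figures for bumps. The sketch below takes the means as 20·68, 15·39, 31·82 and the standard deviations as 14·39, 10·29, 15·44 (our reading of the tables), and computes the coefficient of variation as 100 × S.D. / mean; the variable names are illustrative only.

```python
def coefficient_of_variation(sd, mean):
    """Standard deviation expressed as a percentage of the mean."""
    return 100.0 * sd / mean


# (mean number of bumps, standard deviation) for each maze, as read from the tables.
bumps = {"I": (20.68, 14.39), "II": (15.39, 10.29), "III": (31.82, 15.44)}

cv = {name: coefficient_of_variation(sd, mean) for name, (mean, sd) in bumps.items()}

for name in bumps:
    print(f"Maze {name}: S.D. = {bumps[name][1]}, C. of V. = {cv[name]:.2f}")
```

Maze III, the hardest, has the largest standard deviation but the smallest coefficient of variation, as the text remarks. The recomputed coefficients differ from the printed ones only in the second decimal place, presumably through rounding of the means.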


(3) The first problem to be answered is: How far is steadiness of hand an individual character- 
istic at all? Will the same individual do well in one maze and badly in another? The answer to 
this problem lies in the pupils’ correlation in efficiency in performances in different mazes. Now 
whether the characteristic be acquired by training or be innate we should anticipate a change with 
age. Most innate characteristics grow stronger or weaker with age, and this must be taken into 
account. The following table gives the chief age correlations: 


* The high variabilities in the case of Maze I are we think due to the manner in which different
individuals attempted a novel task.

TABLE I.
Correlations with Age.


Characters                                        Maze I           Maze II          Maze III
No. of bumps and age                              − ·587 ± ·059    − ·532 ± ·065    − ·549 ± ·063
Time taken and age                                − ·241 ± ·085    + ·003 ± ·090    + ·152 ± ·088
Inverse product and age                           + ·402 ± ·076    + ·317 ± ·081    + ·467 ± ·071
No. of bumps and age for constant time taken      − ·607 ± ·057    − ·539 ± ·064    − ·542 ± ·064
Time taken and age for No. of bumps constant      − ·336 ± ·080    − ·101 ± ·089    − ·112 ± ·089


This table shows at once a very considerable relationship between age and the number of 
bumps made: the steadiness of hand increases with age. On the other hand in the case of Mazes II
and III no relationship between age and time taken was demonstrated; in Maze I, however, there
was possibly a slight relationship between time taken and age, of the opposite sense, however, to
the insignificant values in Mazes II and III, i.e. the lower the age the longer the time taken; this
is not improbably due to the novelty factor involved in Maze I. On the whole we may reasonably 
conclude that the relation between time taken and age is not important. Confirmation of this 
arises in the case of the inverse product measure of efficiency and age. The efficiency increases 
with age, but because it includes time is not so marked as in the factor of bumps alone. The two 
remaining correlations indicate what happens if we eliminate respectively the influence of time 
taken and number of bumps. We hardly improve the relation between the number of bumps and 
age, if we make the time taken constant. On the other hand we get one significant but small 
correlation and two insignificant correlations, but all three are now of the same sign if we measure 
the relation between time taken and age for constant number of bumps. It is thus possible that 
there is a very slight relation between time taken and age—rapidity slightly increasing with age 
for a given degree of steadiness of hand. This leads us to the direct problem of the relationship 
between rapidity and steadiness of hand. 
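The two "constant" rows of the table are first-order partial correlations. As a sketch, assuming the usual formula r_xy.z = (r_xy − r_xz r_yz) / √((1 − r_xz²)(1 − r_yz²)), the Maze III entries can be recomputed from the total correlations, which we read as − ·549 (bumps and age), + ·152 (time and age) and − ·432 (bumps and time). The function name is our own.

```python
from math import sqrt


def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))


# Maze III total correlations, as read from the tables of this note.
R_BUMPS_AGE = -0.549
R_TIME_AGE = 0.152
R_BUMPS_TIME = -0.432

# Bumps and age for constant time taken.
bumps_age_given_time = partial_r(R_BUMPS_AGE, R_BUMPS_TIME, R_TIME_AGE)
# Time taken and age for constant number of bumps.
time_age_given_bumps = partial_r(R_TIME_AGE, R_BUMPS_TIME, R_BUMPS_AGE)

print(round(bumps_age_given_time, 3), round(time_age_given_bumps, 3))
```

Holding time constant scarcely alters the correlation of bumps with age, while holding bumps constant turns the small positive correlation of time with age into a small negative one, in agreement with the discussion above.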


TABLE II.
Correlations of Rapidity and Steadiness.


                                                           Maze I           Maze II          Maze III
No. of bumps and time taken                                − ·052 ± ·090    − ·165 ± ·088    − ·432 ± ·073
No. of bumps and time taken for constant age               − ·246 ± ·085    − ·193 ± ·087    − ·422 ± ·074
No. of bumps and time taken for time learnt constant       − ·070 ± ·090    − ·164 ± ·088    − ·444 ± ·072


Without regard to age, it is only in the case of Maze III that we can assert that the number of
bumps increases inversely with the time taken. Allowing for age the associations are more marked,
but by no means as intense as we had anticipated. Further they seem to be dependent on the
difficulty of the maze, i.e. the harder the maze the closer the relationship. A priori one might
imagine that a slow transit would escape bumps—it is so, but not in a very emphatic manner. We 
suggest that a certain degree of rapidity is really helpful in avoiding bumps; it keeps a straight 
course in the straighter parts of the mazes, while it is rapidity at the angles which is calculated to 
produce bumps. There are probably therefore two factors at work. 


We can now turn to the question of individuality in maze-tracing. We find the following
correlations:

TABLE III.
Individuality in Maze-Description.

Correlations                        Mazes I and II     Mazes I and III    Mazes II and III
No. of bumps                        + ·754 ± [?]       + ·761 ± [?]       + ·876 ± ·02[?]
Time taken                          + ·594 ± ·058      + ·436 ± ·073      + ·807 ± ·030
Inverse product                     + ·708 ± ·046      + ·661 ± ·051      + ·695 ± ·047
No. of bumps, age constant          + ·644 ± ·053      + ·648 ± ·052      + ·826 ± ·029
Time taken, age constant            + ·613 ± ·056      + ·493 ± ·068      + ·816 ± ·030
Inverse product, age constant       + ·663 ± ·051      + ·585 ± ·059      + ·652 ± ·052


These are very noteworthy correlations and it is evident that there is a very marked degree of 
individuality in maze performances, whether we judge steadiness, rapidity or the combination of 
both involved in the inverse product measure of efficiency. These high correlations it is true are 
lessened, but still large, if we allow for age*. The existence of this marked individuality brings us 
then to our main problem. Are steadiness and rapidity of hand—which to an appreciable extent 
increase during childhood—products of training, i.e. of environment, or innate characters develop- 
ing with age? 

(4) How far does the length of time during which drawing has been learnt influence the 
rapidity and steadiness of hand in maze-tracing? The correlations are provided in the accompany- 
ing table. 


TABLE IV.
Influence of Time Learnt on Steadiness and Rapidity of Hand.

Correlations                                          Maze I           Maze II          Maze III
No. of bumps and time learnt                          − ·216 ± ·086    − ·286 ± ·083    − ·209 ± ·086
Time taken and time learnt                            − ·077 ± ·090    + ·029 ± ·090    − ·010 ± ·090
Inverse product and time learnt                       + ·151 ± ·088    + ·272 ± ·084    + ·287 ± ·083
No. of bumps and time learnt for constant age         + ·132 ± ·089    − ·008 ± ·090    + ·089 ± ·089
Time taken and time learnt for constant age           + ·053 ± ·090    + ·032 ± ·090    − ·102 ± ·089
Inverse product and time learnt for constant age      − ·066 ± ·090    + ·137 ± ·088    + ·067 ± ·090


Thus while there is no sensible correlation between rapidity of hand and time during which 
drawing has been learnt, the small amount of correlation between steadiness of hand and time 
learnt disappears when we take these correlations for constant age. As far as this material is 
concerned, we see that steadiness and rapidity of hand are not the result of drawing practice, 
but are probably innate characteristics developing with age. This result is so important that it 
needs of course independent verification, but if true its suggestiveness is great. For crafts in which
these characteristics are essential, they can better be obtained by selection than by training. 

We have endeavoured to throw further light on this point by approaching the subject from 
other standpoints. The correlation between age and time learnt is + ·5051 ± ·0671, and it may be
argued that time learnt in the early years of life is not of very great importance, and that this
possibly accounts for the correlation being rather low. We accordingly confined our attention to
the 33 children who did not learn drawing before 10 years of age. The correlation of age and time
learnt now rises to + ·8555 ± ·0315†. We then dealt only with steadiness of hand, and took as


* The correlations of times taken are increased not lessened, but we have already drawn attention to
some irregularity in the relationship of age and time taken.

† This is only very slightly lowered if we confine our attention to children who make the same total
number of bumps, i.e. age and time learnt for constant steadiness of hand is + ·8254 ± ·0374.


174 Miscellanea 


our measure the inverse of the number of bumps made on the three mazes added together. We 
found steadiness of hand (i.e. inverse of total bumps) and time learnt now had the significant
correlation of + ·4030 ± ·0755, while age and steadiness of hand gave a correlation of
+ ·4877 ± ·0895.

We now corrected these last two correlations for age and time learnt respectively and found:
steadiness of hand and time learnt for constant age − ·0034 ± ·1174, age and steadiness of hand
for time learnt constant + ·3015 ± ·1067. There is thus:

(a) no relation between steadiness of hand and time during which drawing has been learnt, if
we correct for age;

(b) a definite relation between steadiness of hand and age, if we correct for length of time
drawing has been learnt.

The corresponding correlations in the case of the whole population under discussion were:

Steadiness of hand and time learnt for constant age     + ·0334 ± ·0900,
Age and steadiness of hand for time learnt constant     + ·3731 ± ·0776,

in sensible agreement with those for the special population who had not begun drawing before
10 years, although in the latter case the correlation of age and time learnt was + ·8555 as against
+ ·5051 for the general population.



As far as these results go they confirm the view that steadiness of hand is an innate character 
developing with age, but having little association with training in drawing. 

(5) We now turn to “craft” and “imagination” as factors in drawing ability. If these be 
correlated with efficiency in maze-tracing, it will not necessarily follow that efficiency in maze- 
tracing is associated with effective drawing training, as apart from length of training. Possibly 
craft and imagination in drawing are themselves in the first place innate characters, developing 
no doubt with age, but not necessarily intimately associated with time during which training has 
been given. 

In dealing with “imagination” and “craft” the method of “biserial-r” was adopted. Poor
craft contains the classes “minus,” very bad and bad, and good craft, the classes medium, good
and very good. Poor imagination contains the classes minus, very bad, bad and medium, and 
good imagination the classes good and very good*. 
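The biserial method correlates a measured variate with a character recorded only in two classes, on the supposition that a normally distributed variate underlies the division. A minimal sketch, assuming the usual form r = (M₁ − M₀)/σ × pq/z, where p and q are the class proportions and z is the ordinate of the standard normal curve at the point of division, is given below. The data are hypothetical and are not taken from the note.

```python
from statistics import NormalDist, fmean, pstdev


def biserial_r(values, in_upper_class):
    """Biserial correlation between a measured variate and a two-class grading."""
    upper = [v for v, flag in zip(values, in_upper_class) if flag]
    lower = [v for v, flag in zip(values, in_upper_class) if not flag]
    if not upper or not lower:
        raise ValueError("both classes must contain at least one observation")
    p = len(upper) / len(values)
    q = 1.0 - p
    std_normal = NormalDist()
    # Ordinate of the standard normal curve at the point cutting off proportion p.
    z = std_normal.pdf(std_normal.inv_cdf(p))
    return (fmean(upper) - fmean(lower)) / pstdev(values) * p * q / z


# Hypothetical numbers of bumps, with a flag for pupils graded "good craft".
bump_counts = [10, 30, 15, 25, 20, 35, 22, 40]
good_craft = [True, False, True, True, False, False, True, False]

r_bis = biserial_r(bump_counts, good_craft)
print(round(r_bis, 3))  # negative: good craft goes with fewer bumps in this toy sample
```

With markedly non-normal data the biserial coefficient can exceed unity in magnitude, which is one reason for not making the smaller class too small, as the footnote on the choice of classes indicates.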


We have first to note the influences of age and time learnt on craft and imagination. 


TABLE V.
Influence of Time Learnt and Age on Craft and Imagination.

Character pair                                              Value of correlation
Good imagination and age                                    − ·479 ± ·094
Good imagination and time learnt                            + ·033 ± ·090
Good craft and age                                          − ·096 ± ·112
Good craft and time learnt                                  + ·166 ± ·114
Good imagination and age for constant time learnt           − ·675 ± [·060]†
Good imagination and time learnt for constant age           + ·363 ± [·078]
Good craft and age for constant time learnt                 − ·217 ± [·086]
Good craft and time learnt for constant age                 + ·261 ± [·083]


* The choice of series was made solely with a view to obtaining not too small frequency in the smaller
series.

† Probable errors of these partial correlations are given as rough estimates for they are calculated on
the basis of all the component correlations having been found by the product-moment method.

Now the absolute correlations are extremely interesting: there is no relation between time
learnt and either imagination or craft; these factors of drawing capacity appear like steadiness and
rapidity of hand to be innate. The influence of age on craft is not significant, but age seems to
weaken imagination, i.e. the younger children are more imaginative in their drawing work. If we 
turn to the partial correlations we see, however, the bearing of these results. For a constant age 
imagination is moderately influenced by the time learnt; but for a constant amount of training it 
depreciates more markedly with age. The result is that there is no apparent relation of imagination 
to training. The diminution of the innate character with age is really more influential than its 
growth with training. Again for constant age there is a very moderate influence of training on 
craftsmanship, but for a constant time learnt craft diminishes with age*. The result again is 
that innate change with age masters training, and unless training is persistent, good craft will 
lessen, so that the correlation with age is either negative or insensible. The small influence of 
time learnt on these factors of drawing efficiency is remarkable, and it is highly suggestive to see 
that in certain characters training may only suffice to prevent deterioration, and does not provide 
a marked expansion of efficiency. 

We may now turn to the influence of imagination and craft on the steadiness and rapidity of 
hand exhibited in maze-tracing. 


TABLE VI.
Influence of Craft and Imagination on Steadiness and Rapidity of Hand.

Characters                                                Maze I              Maze II             Maze III
Good imagination and no. of bumps                         + ·239 ± ·109       + ·187 ± ·109       + ·252 ± ·109
Good imagination and time taken                           + ·005 ± ·115       + ·029 ± ·114       + ·167 ± ·112
Good craft and no. of bumps                               − ·163 ± ·113       − ·109 ± ·114       − ·039 ± ·115
Good craft and time taken                                 + ·500 ± ·093       + ·104 ± ·114       − ·308 ± ·107
Good imagination and no. of bumps for constant age        − ·059 ± [·090]†    − ·092 ± [·089]     − ·014 ± [·090]
Good imagination and time taken for constant age          − ·130 ± [·089]     + ·035 ± [·090]     + ·277 ± [·083]
Good craft and no. of bumps for constant age              − ·272 ± [·084]     − ·189 ± [·087]     − ·110 ± [·089]
Good craft and time taken for constant age                + ·494 ± [·068]     + ·104 ± [·089]     − ·298 ± [·082]


Now these results are of much interest and suggestive for further inquiry as soon as we are able 
to deal with much larger numbers. In the correlation uncorrected for age there would appear to be 
slight relation between good imagination and a large number of bumps, but it is only because the 
younger students are more imaginative. Corrected for age the correlations are all reversed, but 
are seen to be of no significance. Good imagination and time taken cannot be considered to have 
significant relationship either before or after correction for age. Thus the imaginative factor in 
drawing skill is not sensibly associated with rapidity or steadiness of hand. 

With regard to craft there do appear to be significant associations, but they are clearly 
changing with growth of experience in maze-tracing. Uncorrected for age, there is no really 
significant association between good craft and steadiness of hand although the constancy of sign 
is to be noted. After correction for age, it would appear probable that a small association exists 
—good craft having the steadier hand. But the relationship appears to be weakening with ex- 
perience and is hardly significant in the third maze. The same change makes itself manifest in the 
time taken. Those with good craft took the longer time in the first maze and there is quite a 


* Miss Dalgliesh reports that she judged of the craft capacity of her pupils quite apart from their 
age or technical ability. Thus given two children of 9 and 16 years of age whom she had rated with the 
same grade of craft capacity, the elder child would (if teachable) be doing the better drawing work, 
having had as a rule longer training. But this increased technical power was not regarded in the craft 
grading. 

† See second footnote, p. 174.

sensible degree of correlation; in the second maze the correlation is insensible, while in the third
it has become negative. In other words in the third maze good craft has begun to tell. It would 
require far more material and prolonged experiment to be certain how far it is the hesitation of the 
good craftsman over a novel task (Maze I) or the greater difficulty of the third maze which has 
told in the favour of good craft in that case. All we can assert is that within the small range of our 
experiment the slight relation of good craft to steadiness of hand appears to decrease, while the 
relationship of good craft to rapidity of hand is only beginning to develop in the third experiment, 
and is then not of any substantial intensity. We have already seen that there is only a small 
association of good craft and time learnt, about enough to allow for the deterioration of craft 
with age. Hence the slight relationship suggested between good craft and rapidity of hand is not 
necessarily an argument in favour of such rapidity arising from training, it may well be the result 
of an association of the innate characters. 

(6) The remark at the end of the last section leads us directly to the problem of whether other 
qualities than those of draughtsmanship can be directly associated with steadiness and rapidity 
of hand. A priori we think there is much to be said for both mathematical and musical capacity 
being innate*. The former except in the case of geometrical drawing gives small training for the 
hand, but it does enable the owner of the capacity to realise more or less vividly a conception of 
the desired perfect maze description. On the other hand music not only gives much finger practice, 
but in the case of special ability probably signifies an inherited flexibility of hand. In our division 
of the material we have made only two classes—those of the students who possessed marked 
ability in music and in mathematics were separated into small classes from the remainder—the 
mediocre, the non-mathematical and the non-musical. We then applied the biserial method. 


TABLE VII.
Association of Mathematical and Musical Capacity with Steadiness and Rapidity of Hand.

Characteristics                                               Maze I             Maze II            Maze III
Mathematical capacity and no. of bumps                        − ·112 ± ·139      − ·216 ± ·136      − ·136 ± ·138
Musical capacity and no. of bumps                             − ·091 ± ·170      − ·042 ± ·170      − ·088 ± ·170
Mathematical capacity and time taken                          − ·608 ± ·106      − ·390 ± ·126      − ·214 ± ·135
Musical capacity and time taken                               − ·303 ± ·162      − ·233 ± ·165      − ·029 ± ·170
Mathematical capacity and no. of bumps for constant age       − ·012 ± [·090]    − ·148 ± [·088]    − ·049 ± [·090]
Musical capacity and no. of bumps for constant age            − ·042 ± [·090]    + ·012 ± [·090]    − ·042 ± [·090]
Mathematical capacity and time taken for constant age         − ·592 ± [·059]    − ·396 ± [·076]    − ·237 ± [·085]
Musical capacity and time taken for constant age              − ·290 ± [·083]    − ·234 ± [·085]    − ·046 ± [·090]

Now none of the correlations of mathematical or musical capacity with steadiness of hand are
in themselves significant, but as all six of them are of the same sign, we may possibly assert a
slender absolute relation between both and steadiness of hand above the average. On the other
hand the rapidity of hand shows at first a definite, and in the case of mathematics a marked,
relationship with both mathematical and musical capacity. But this relationship seems rapidly to
diminish with experience of maze-tracing. It is probable that the mathematicians had at first an
advantage which was not maintained. It may be suspected that they realised better at the outset
what was required; an appreciation also of time taken may have been a factor at the outset with
the musicians. But both these advantages seem to diminish as the non-mathematical and non-musical
gain experience. The above remarks refer to the total correlations; when corrected for
age we see that the conclusions for steadiness of hand are confirmed; even for constant age it has
no sensible relationship to either mathematical or musical capacity. Rapidity of hand does show
relationship with mathematics in a more marked, and with music in a less marked degree when we
correct for age. But again both the associations are lessening with experience; for the third maze
the superiority of the good mathematicians is only very moderate and the superiority of the good
musicians has become insensible. There can be little doubt that the more marked superiority
for Maze I was due to a better appreciation of what was needed in a novel task.

* The correlation of mathematical capacity with age was + .175 ± .137 and of musical capacity with
age + .098 ± .169, which are satisfactory as showing that the teachers really judged capacity and not
knowledge; they are as far as they go also some evidence for mathematical and musical capacity
being innate characteristics. Direct evidence for the hereditary character of musical capacity may be
found of course in the pedigree of the Bach family. It is less demonstrated in the case of mathematics,
but the Gregories might be cited, and possibly one or two recent instances will occur to those familiar
with the Cambridge Tripos Lists.


(7) Conclusions. The material dealt with is admittedly slender, and was analysed only as a step 
towards more elaborate returns; it was made in order to determine what additional experiments 
should be tried. Our conclusions are therefore suggestive rather than dogmatic. We have not 
been able to associate in a marked degree rapidity and steadiness of hand as exhibited in maze- 
tracing with any training; we more than suspect them to be innate characteristics *. Good craft, 
mathematical ability and musical capacity seem to some not very marked extent associated with 
rapidity of hand, but it is noteworthy that even in these cases the advantage was rather an initial 
one and tended to weaken with experience. An apparently noteworthy point, which is well worth 
confirmation or contradiction, is that continued training may only just suffice to maintain a grade 
of efficiency, which deteriorates with age. It would be of much interest to demonstrate that 
training in some cases does not create or even develop a faculty, but maintains it at the higher 
range of efficiency which belongs to an earlier age. It is possible that the teacher cannot develop 
imagination in the later stages of youthful growth, but may be able to preserve the greater 
imagination of the child by proper training. Certain faculties may be most intense at certain 
stages of growth. If education seizes upon them at this age and maintains their then intensity, 
we may be apt to overlook their history, and suppose them created by the educational process. 
The point is worth a direct and more intensive investigation. Here it is only a suggestion. 


I have to thank Professor Pearson for his assistance during the preparation of this paper. 


[* After thirty years' experience in teaching in a drawing office I think it safe to say that within a
fortnight it is possible to assert of the bulk of freshmen engineers whether or no they will be good
draughtsmen at the end of their two to three years' course. The power of rapid, steady, uniform bold
work is there in germ or it is not. Knowledge of method and accuracy of result may be acquired, but
only to a minor extent can anyone acquire that which distinguishes a good from a mediocre draughtsman.
K. P.]


II. On the moments of the normal correlation function of n variables†.


By SVERKER BERGSTRÖM, STOCKHOLM.


1. The normal correlation function of $n$ variables can be written, by the celebrated theorem
of Pearson‡,

$$z = \frac{1}{(2\pi)^{n/2}\,\sigma_1\sigma_2\cdots\sigma_n\sqrt{R}}\;\exp\Big\{-\frac12\sum_{p=1}^{n}\sum_{q=1}^{n}\frac{R_{pq}}{R}\,\frac{X_pX_q}{\sigma_p\sigma_q}\Big\} \qquad (1),$$

where

$$R = \begin{vmatrix} 1 & r_{12} & \cdots & r_{1n}\\ r_{21} & 1 & \cdots & r_{2n}\\ \vdots & \vdots & & \vdots\\ r_{n1} & r_{n2} & \cdots & 1 \end{vmatrix} \qquad (2),$$

and $R_{pq}$ is the cofactor (signed minor) of the element in the $p$th row and the $q$th column.
The variables $X_p$ are measured from their mean values and the $\sigma_p$ are their standard deviations;
$r_{pq}$ denotes the coefficient of correlation between $X_p$ and $X_q$, where

$$r_{pq} = r_{qp},\qquad r_{pp} = 1.$$

The moment $p_{a_1a_2\ldots a_n}$ is, by definition,

$$\int\!\!\cdots\!\!\int X_1^{a_1}X_2^{a_2}\cdots X_n^{a_n}\,z\,dX_1\,dX_2\cdots dX_n.$$

It is the mean value of $X_1^{a_1}X_2^{a_2}\cdots X_n^{a_n}$.

To simplify the writing we shall use normal coordinates

$$x_p = \frac{X_p}{\sigma_p}.$$

Let us introduce, further, the notation

$$a_{pq} = \frac{R_{pq}}{R}.$$

The correlation function becomes

$$z = \frac{1}{(2\pi)^{n/2}\sqrt{R}}\;e^{-\frac12\sum\sum a_{pq}x_px_q}.$$

Our aim will be to calculate the mean value

$$M\,[x_1^{a_1}x_2^{a_2}\cdots x_n^{a_n}].$$

[† The present paper reached the editor later than the three other memoirs dealing with allied topics
published in this part of the Journal, but the methods adopted are of sufficient interest to justify its
appearance in association with those papers.]

‡ See K. Pearson, Phil. Trans. Vol. 187 (1896) and Vol. 200 (1903).

2. Consider the integral

$$J = \int\!\!\cdots\!\!\int x_1^{a_1}x_2^{a_2}\cdots x_n^{a_n}\,e^{-\frac12\sum\sum a_{pq}x_px_q}\,dx_1\,dx_2\cdots dx_n.$$

It is easy to verify that this integral is uniformly convergent, provided that
$$|a_{pq}| > k > 0,$$
$k$ being a fixed number.

Differentiating the integral $J$ with respect to $a_{pq}$*, we obtain, apart from sign,

$$J_1 = \int\!\!\cdots\!\!\int x_1^{a_1}\cdots x_p^{a_p+1}\cdots x_q^{a_q+1}\cdots x_n^{a_n}\,e^{-\frac12\sum\sum a_{pq}x_px_q}\,dx_1\,dx_2\cdots dx_n.$$

Hence, if $J$ is a known function of the $a_{pq}$, $J_1$ can be obtained by one differentiation. It follows
that our question reduces to the calculation of the derivatives, with respect to the coefficients $a_{pq}$, of the integral

$$I = \int\!\!\cdots\!\!\int e^{-\frac12\sum\sum a_{pq}x_px_q}\,dx_1\,dx_2\cdots dx_n,$$

whose value is, according to Pearson,

$$I = \frac{(2\pi)^{n/2}}{\sqrt{D}},$$

where $D = \|a_{pq}\|$.

* It must be remembered that $a_{pq} = a_{qp}$.

3. Let us now introduce the notations


A 
ee g He Re oa tate, Docawencauantevesncacsee (EL); 
ited Ott, a, a, me D, 


iia = g g Pr 
Pr eileto me NGG a One x 
3 4 sae Lee, Y% PyIy “WpPy 


(hr) HN 0 0 
Prepay nytg= HE (ga — bag —) cocvenscesevesscesstenseeenee( 2) 


Ey WZ 
1 Dl 


We shall say that by this process one calculates the symmetric derivative of F. The definition extends
to the case where p = q, provided it is agreed that


/ OF 
ee Opp 
Let us thus differentiate (8) and (9) successively.
, oe re ee Ga py: 5 
la = M [%p, %a,] Ane D1 Mascaeeraeseovassteer eee Loy 
77 3 Zo , , C4 7 
IE Pyy? Pp, = [%p,a, Rial" Sins D194 P ry ~ Saaae Dy dy? Poi rreeereeee (LA) 
One is led to put, quite generally,

$$M\,[x_{p_1}x_{q_1}\cdots x_{p_k}x_{q_k}] = \frac{1}{2^k}\Big\{(2k-1)!!\,\Sigma_k^{k} - (2k-3)!!\,2\,\Sigma_k^{k-1} + \ldots + (-1)^{k-i}(2i-1)!!\,2^{k-i}\,\Sigma_k^{i} + \ldots + (-1)^{k-1}\,2^{k-1}\,\Sigma_k^{1}\Big\} \qquad (15),$$

where, for brevity,

$$(2p+1)!! = (2p+1)(2p-1)\cdots 3\cdot 1 \qquad (16),$$

and where $\Sigma_k^{i}$ denotes a sum of terms each containing $i$ symmetric derivatives divided by $D$, the
sum of whose symbolic exponents is equal to $k$; there are in $\Sigma_k^{i}$ as many such terms as there are ways
of grouping $k$ elements into $i$ groups, no account being taken of the order of equal groups. This
number is

$$g_k^{i} = \sum \frac{k!}{(a_1!)^{m_1}(a_2!)^{m_2}\cdots\; m_1!\,m_2!\cdots} \qquad (17),$$

where

$$m_1a_1 + m_2a_2 + \ldots = k,\qquad m_1 + m_2 + \ldots = i \qquad (18).$$

Formula (15) can be verified by induction. We do not dwell upon it;
it presents no difficulty.

4. We must consider a partial derivative of the kind from which are built up, in the first place, the
symmetric derivatives and then the sums $\Sigma_k^{i}$.

The derivative

$$\frac{\partial^k D}{\partial a_{p_1q_1}\,\partial a_{p_2q_2}\cdots\partial a_{p_kq_k}}$$

must evidently be equal, except perhaps for sign, to the cofactor of the term

$$a_{p_1q_1}\,a_{p_2q_2}\cdots a_{p_kq_k}.$$

The sign is obtained by counting the number of inversions in $p_1, p_2, \ldots, p_k$, say $i$, and
in $q_1, q_2, \ldots, q_k$, say $i'$.

Now, by virtue of the well-known theorem on the connexion between a minor of a determinant and
that of its reciprocal, and because of the identity

$$D = \frac{1}{R},$$


180 Miscellanea 
this cofactor can be written

$$\frac{1}{R^{\,n-k}}\;R^{\,n-k-1}\;R^{\,q_1q_2\ldots q_k}_{\,p_1p_2\ldots p_k} = \frac{1}{R}\,R^{\,q_1q_2\ldots q_k}_{\,p_1p_2\ldots p_k},$$

where the determinant in the numerator is formed by the elements common to the rows $p_1, p_2, \ldots p_k$
and to the columns $q_1, q_2, \ldots q_k$ of the determinant $R$ (2).

Hence

$$\frac{\partial^k D}{\partial a_{p_1q_1}\cdots\partial a_{p_kq_k}} = (-1)^{i+i'}\,\frac{1}{R}\,R^{\,q_1q_2\ldots q_k}_{\,p_1p_2\ldots p_k} \qquad (19).$$

One can, moreover, avoid considering the number of inversions by substituting for
$R^{\,q_1q_2\ldots q_k}_{\,p_1p_2\ldots p_k}$ a determinant whose principal diagonal consists of the elements

$$r_{p_1q_1},\; r_{p_2q_2},\; \ldots,\; r_{p_kq_k},$$

and which we shall denote by exchanging the letter $R$ for $Q$. For let us permute rows or columns in the
determinant $Q$: only the sign changes. The indices of the determinant $R$
having no inversion, we see that

$$Q^{\,q_1q_2\ldots q_k}_{\,p_1p_2\ldots p_k} = (-1)^{i+i'}\,R^{\,q_1q_2\ldots q_k}_{\,p_1p_2\ldots p_k}.$$

We shall therefore finally have

$$\frac{1}{D}\,\frac{\partial^k D}{\partial a_{p_1q_1}\cdots\partial a_{p_kq_k}} = Q^{\,q_1q_2\ldots q_k}_{\,p_1p_2\ldots p_k} \qquad (20),$$

$$Q^{\,q_1q_2\ldots q_k}_{\,p_1p_2\ldots p_k} = \begin{vmatrix} r_{p_1q_1} & r_{p_1q_2} & \cdots & r_{p_1q_k}\\ r_{p_2q_1} & r_{p_2q_2} & \cdots & r_{p_2q_k}\\ \vdots & \vdots & & \vdots\\ r_{p_kq_1} & r_{p_kq_2} & \cdots & r_{p_kq_k} \end{vmatrix} \qquad (21).$$

If two of the numbers $p$ (or of the $q$) are equal, the determinant (21) will contain two equal rows
(or columns): it will vanish; and, indeed, the second derivative of a determinant with respect
to two elements belonging to the same row (or to the same column) is zero.

The symmetric derivative of the $k$th order will consist, by (12), of $2^k$ simple derivatives such
as (20). They are found by interchanging the $p$ and the $q$ of the same index in all
possible ways. Denoting these interchanges by $S$, we shall have

$$\frac{1}{D}\,D^{(k)}_{p_1q_1,\,p_2q_2,\ldots,\,p_kq_k} = S\;Q^{\,q_1q_2\ldots q_k}_{\,p_1p_2\ldots p_k} \qquad (22).$$

It should be noted that the term formed from the principal diagonal of (21) will be reproduced once
and once only in each simple derivative; there will be $2^k$ of them in the derivative (22).


5. Let us return to expression (15). It will simplify. First of all


20 = JD. 
Next, let us consider a term in $\Sigma_k^{i}$, for example

$$\frac{D^{(n_1)}\,D^{(n_2)}\cdots D^{(n_i)}}{D^{\,i}} \qquad (23),$$

$$n_1 + n_2 + \ldots + n_i = k.$$

By (22)

$$\frac{D^{(n)}}{D} = S\;Q^{\,q_{a_1}q_{a_2}\ldots q_{a_n}}_{\,p_{a_1}p_{a_2}\ldots p_{a_n}}.$$

This is a polynomial in $r$, homogeneous and of order $n$.

This being so, it is easily seen that expression (23), likewise homogeneous, is of order

$$n_1 + n_2 + \ldots + n_i = k.$$

The term

$$r_{p_1q_1}\,r_{p_2q_2}\cdots r_{p_kq_k} \qquad (24)$$

will have in it the coefficient $2^{\,n_1+n_2+\ldots+n_i} = 2^k$.

Hence the term (24) will obtain in $\Sigma_k^{i}$ the coefficient $2^k g_k^{i}$.

In general, on substituting in expression (15), one sees that it is a homogeneous function
of order $k$ in $r$.

The nature of the problem requires that this function be also symmetric.

For let us permute the indices of the left-hand member of (15); since the result must not change
on that account, the term (24), with its indices permuted in the same way, keeps the same coefficient.


6. We need therefore consider only a single term of (15), say (24).

Its coefficient is

$$(2k-1)!!\,g_k^{k} - (2k-3)!!\,2\,g_k^{k-1} + (2k-5)!!\,2^2\,g_k^{k-2} - \ldots + (-1)^{k-1}\,2^{k-1}\,g_k^{1} = \sum_{i=1}^{k}(-1)^{i-1}(2k-2i+1)!!\;2^{\,i-1}\,g_k^{\,k-i+1} \qquad (25).$$

I say that this expression is equal to unity, whatever $k$ may be.
I shall use reasoning by induction.
The assertion being evident for $k = 1$, I suppose it proved for $k$, and we shall see
that it still holds for $k + 1$, that is to say that the expression

$$\sum_{i=1}^{k+1}(-1)^{i-1}(2k-2i+3)!!\;2^{\,i-1}\,g_{k+1}^{\,k-i+2} \qquad (26)$$

will be equal to unity.

Let us first examine the numbers $g_n^{i}$.
Let us seek $g_{n+1}^{i}$, supposing $g_n^{i}$ known.

Either the $(n+1)$th element must be placed in one of the $i$ groups formed from $n$ elements: there are
$i\,g_n^{i}$ possibilities.
Or this element must form a group by itself, the other elements then forming $(i-1)$ groups: we
shall thus have $g_n^{\,i-1}$ possibilities.

Whence the formula

$$g_{n+1}^{i} = i\,g_n^{i} + g_n^{\,i-1} \qquad (27).$$

Agreeing that $g_n^{0} = 0$, $g_n^{\,n+1} = 0$, this formula still holds for $i = n+1$ and $i = 1$.

Expression (26) can now be written, by virtue of (27),

$$\sum_{i=1}^{k+1}(-1)^{i-1}(2k-2i+3)!!\;2^{\,i-1}\Big\{(k-i+2)\,g_k^{\,k-i+2} + g_k^{\,k-i+1}\Big\}$$
$$= \sum_{i=2}^{k+1}(-1)^{i-1}(2k-2i+3)!!\;2^{\,i-1}(k-i+2)\,g_k^{\,k-i+2} + \sum_{i=1}^{k}(-1)^{i-1}(2k-2i+3)!!\;2^{\,i-1}\,g_k^{\,k-i+1}.$$

Changing $i$ into $(i+1)$ in the first sum, we find finally for (26)

$$\sum_{i=1}^{k}(-1)^{i-1}(2k-2i+1)!!\;2^{\,i-1}\Big\{-2\,(k-i+1) + (2k-2i+3)\Big\}\,g_k^{\,k-i+1}.$$

This is nothing other than (25); hence the sum (26) is indeed equal to unity. Q.E.D.
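
The identity just proved lends itself to a quick numerical check. The following is a minimal sketch, not part of the original note: it builds the group counts from the recurrence (27) and evaluates the alternating sum (25) for the first few values of k. The function names are illustrative assumptions.

```python
from math import prod


def odd_double_factorial(m: int) -> int:
    """m!! = m (m-2) ... 3 . 1 for odd m >= 1."""
    return prod(range(m, 0, -2))


def group_counts(k_max: int):
    """g[n][i] = number of ways of splitting n elements into i unordered groups,
    built from recurrence (27): g[n+1][i] = i*g[n][i] + g[n][i-1]."""
    g = [[0] * (k_max + 2) for _ in range(k_max + 1)]
    g[1][1] = 1
    for n in range(1, k_max):
        for i in range(1, n + 2):
            g[n + 1][i] = i * g[n][i] + g[n][i - 1]
    return g


def coefficient_25(k: int, g) -> int:
    """Alternating sum (25): sum_i (-1)^(i-1) (2k-2i+1)!! 2^(i-1) g_k^(k-i+1)."""
    return sum(
        (-1) ** (i - 1)
        * odd_double_factorial(2 * k - 2 * i + 1)
        * 2 ** (i - 1)
        * g[k][k - i + 1]
        for i in range(1, k + 1)
    )


g = group_counts(10)
results = [coefficient_25(k, g) for k in range(1, 11)]
print(results)
```

For k = 3, for instance, the sum reads 15·1 − 3·2·3 + 1·4·1, in agreement with the induction argument.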
7. Taking this into account we write for (15) the general formula

$$M\,[x_{p_1}x_{p_2}\cdots x_{p_{2k-1}}x_{p_{2k}}] = \sum r_{p_{a_1}p_{a_2}}\;r_{p_{a_3}p_{a_4}}\cdots r_{p_{a_{2k-1}}p_{a_{2k}}} \qquad (28).$$

The indices of the $r$ in this sum are obtained by grouping the numbers $p$ into $k$ groups of two,
without regard to the order either of the groups or of the elements of one and the same group. The sum
(28) will therefore contain in all

$$\frac{(2k)!}{2^k\,k!} = (2k-1)!!$$

terms.

If some of the $p$ are equal, they must be imagined as carrying distinguishing indices; in the final
result one remembers that

$$r_{pp} = 1.$$

Let, for example, $p_1 = p_2 = \ldots = p_{2k}$.

We recover the known formula

$$\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty} x^{2k}\,e^{-\frac12 x^2}\,dx = (2k-1)!!.$$
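
As an illustration of formula (28), the sketch below enumerates every way of grouping the indices two by two and sums the corresponding products of correlation coefficients. It is a hedged example rather than the author's procedure: the dictionary `r`, keyed by unordered index pairs, and the numerical values in it are hypothetical.

```python
from fractions import Fraction


def pairings(items):
    """Yield every grouping of `items` into unordered pairs (items may repeat)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for j in range(len(rest)):
        pair = (first, rest[j])
        for tail in pairings(rest[:j] + rest[j + 1:]):
            yield [pair] + tail


def normal_moment(indices, r):
    """Mean value of x_{p1} x_{p2} ... for standardised normal variables, by (28).

    `r` maps frozenset({p, q}) to the correlation r_pq; r_pp is taken as 1.
    """
    if len(indices) % 2:
        return 0  # an odd number of factors admits no pairing
    total = 0
    for pairing in pairings(list(indices)):
        term = 1
        for p, q in pairing:
            term *= 1 if p == q else r[frozenset((p, q))]
        total += term
    return total


# Hypothetical correlation values, chosen only to exercise the formula.
r = {
    frozenset((1, 2)): Fraction(1, 2), frozenset((1, 3)): Fraction(1, 3),
    frozenset((1, 4)): Fraction(1, 5), frozenset((2, 3)): Fraction(1, 7),
    frozenset((2, 4)): Fraction(1, 11), frozenset((3, 4)): Fraction(1, 13),
}
print(normal_moment([1, 2, 3, 4], r))   # r12 r34 + r13 r24 + r14 r23
print(normal_moment([1, 1, 2, 2], r))   # 2 r12^2 + 1
```

Repeated indices are handled exactly as the text prescribes: they are treated as distinct elements during the grouping, and a pair of equal indices contributes the factor 1.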


8. Let us consider the general case

$$M\,[x_1^{a_1}x_2^{a_2}\cdots x_p^{a_p}].$$

By formula (28) this mean value can be written

$$\sum A\;\prod_{i=1}^{p}\prod_{j>i} r_{ij}^{\,e_{ij}} \qquad (29),$$

where the $e_{ij}$ are the solutions of the system

$$\begin{aligned} e_{11} + e_{12} + \ldots + e_{1p} &= a_1\\ e_{12} + e_{22} + \ldots + e_{2p} &= a_2\\ \cdots\cdots\cdots\cdots\cdots\cdots\cdots &\\ e_{1p} + e_{2p} + \ldots + e_{pp} &= a_p \end{aligned} \qquad (30),$$

the $e_{ii}$ having, moreover, to be even numbers.

Consider $A$.

One must arrange in all possible ways $a_j$ elements in $p$ groups containing respectively
$e_{1j}, e_{2j}, \ldots, e_{pj}$ elements, taking account of the order of the groups; there are

$$\frac{a_j!}{e_{1j}!\,e_{2j}!\cdots e_{pj}!}$$

ways.

Then one must couple $e_{ij}$ elements with as many elements; that gives

$$e_{ij}!$$

ways.

Finally one must arrange $e_{ii}$ elements in $\tfrac12 e_{ii}$ groups of two, taking no account of
the order of the groups. There will be

$$(e_{ii}-1)!!$$

possibilities.

Whence, after reductions,

$$M\,[x_1^{a_1}x_2^{a_2}\cdots x_p^{a_p}] = \sum \frac{a_1!\,a_2!\cdots a_p!}{\prod_{i} e_{ii}!!\;\prod_{i<j} e_{ij}!}\;\prod_{i<j} r_{ij}^{\,e_{ij}} \qquad (31),$$

where the $e_{ij}$ are the solutions of the system (30) and where

$$e_{ii}!! = e_{ii}\,(e_{ii}-2)\cdots 4\cdot 2.$$

Let us treat as an application the case $p = 2$. We see that $e_{12}$ is at most equal to the smaller of the numbers
$a_1$ and $a_2$, say $a_2$. This gives

$$M\,[x_1^{a_1}x_2^{a_2}] = \frac{a_1!}{(a_1-a_2)!!}\,r^{a_2} + \frac{a_1!\;a_2(a_2-1)}{(a_1-a_2+2)!!\;2!!}\,r^{a_2-2} + \frac{a_1!\;a_2(a_2-1)(a_2-2)(a_2-3)}{(a_1-a_2+4)!!\;4!!}\,r^{a_2-4} + \ldots \qquad (32).$$
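
The two-variable series (32) can be evaluated directly. The sketch below is an illustrative implementation under the reading of (32) given by the general formula (31) with p = 2, namely a sum over j = 0, 2, 4, … of a_1! · a_2(a_2−1)⋯(a_2−j+1) / ((a_1−a_2+j)!! j!!) · r^(a_2−j); the function names are assumptions and are not taken from the paper.

```python
from fractions import Fraction
from math import factorial


def even_double_factorial(m: int) -> int:
    """m!! = m (m-2) ... 4 . 2 for even m >= 0, with 0!! = 1."""
    out = 1
    while m > 1:
        out *= m
        m -= 2
    return out


def bivariate_moment(a1: int, a2: int, r):
    """Mean value of x1^a1 * x2^a2 for two standardised normal variables with correlation r."""
    if a1 < a2:
        a1, a2 = a2, a1          # the series is written for a2 <= a1
    if (a1 + a2) % 2:
        return 0                 # odd total order: the moment vanishes
    total = 0
    for j in range(0, a2 + 1, 2):
        falling = factorial(a2) // factorial(a2 - j)   # a2 (a2-1) ... (a2-j+1)
        coeff = Fraction(
            factorial(a1) * falling,
            even_double_factorial(a1 - a2 + j) * even_double_factorial(j),
        )
        total += coeff * r ** (a2 - j)
    return total


r = Fraction(1, 3)
print(bivariate_moment(2, 2, r))   # 2 r^2 + 1
print(bivariate_moment(3, 3, r))   # 6 r^3 + 9 r
print(bivariate_moment(4, 2, r))   # 12 r^2 + 3
```

Exact rational arithmetic is used so that the output can be compared term by term with the tabulated moments of the fourth and sixth orders.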
Finally I write down the moments up to the sixth order, using the notation

$$\beta_{a_1a_2\ldots} = M\,[x_1^{a_1}x_2^{a_2}\cdots].$$

$$\beta_2 = 1;\qquad \beta_{11} = r_{12}.$$

$$\beta_4 = 3;\qquad \beta_{31} = \beta_{13} = 3r_{12},\qquad \beta_{22} = 2r_{12}^2 + 1;$$
$$\beta_{211} = 2r_{12}r_{13} + r_{23};$$
$$\beta_{1111} = r_{12}r_{34} + r_{13}r_{24} + r_{14}r_{23}.$$

$$\beta_6 = 15;\qquad \beta_{51} = \beta_{15} = 15r_{12},\qquad \beta_{42} = \beta_{24} = 12r_{12}^2 + 3,\qquad \beta_{33} = 6r_{12}^3 + 9r_{12};$$
$$\beta_{411} = 12r_{12}r_{13} + 3r_{23},\qquad \beta_{321} = 6r_{12}^2r_{13} + 6r_{12}r_{23} + 3r_{13},$$
$$\beta_{222} = 8r_{12}r_{13}r_{23} + 2r_{12}^2 + 2r_{13}^2 + 2r_{23}^2 + 1;$$
$$\beta_{3111} = 6r_{12}r_{13}r_{14} + 3r_{12}r_{34} + 3r_{13}r_{24} + 3r_{14}r_{23},$$
$$\beta_{2211} = 2r_{12}^2r_{34} + 4r_{12}r_{13}r_{24} + 4r_{12}r_{14}r_{23} + 2r_{13}r_{14} + 2r_{23}r_{24} + r_{34};$$
$$\beta_{21111} = 2r_{12}r_{13}r_{45} + 2r_{12}r_{14}r_{35} + 2r_{12}r_{15}r_{34} + 2r_{13}r_{14}r_{25} + 2r_{13}r_{15}r_{24} + 2r_{14}r_{15}r_{23} + r_{23}r_{45} + r_{24}r_{35} + r_{25}r_{34};$$
$$\beta_{111111} = r_{12}r_{34}r_{56} + r_{13}r_{24}r_{56} + \ldots\ \text{(15 terms)}.$$

The moments up to order 4 in two variables were calculated by Pearson and then by
Soper, who gave a general formula for the moments in two variables; Biometrika, Vol. IX,
1913, p. 101. (It should be noted, however, that his formula (xxxii) is affected by a
typographical error.)
The case of three variables was treated by Wicksell in "The general characteristics of the
frequency function of stellar movements, etc." Lund 1915, p. 11.
Finally Isserlis deduced our formula (28) for the case 2k = 4, or of four variables, in
Biometrika, Vol. XI, 1916, p. 189.


III. Formulae for determining the Mean Values of Products of Devia- 
tions of mixed Moment Coefficients in two to eight Variables in 
Samples taken from a limited Population. 


By L. ISSERLIS, D.Sc.* 


A. Let $p_{12}$, $p_{34}$ denote the product moment coefficients, referred to a fixed origin, in a sample
of size $n$ extracted out of a population of size $N$, there being four independent variables $x_1, x_2,
x_3, x_4$. The mean value of $p_{12}$ in many samples is $\bar p_{12}$, the corresponding product moment
coefficient for the sampled population. Let $dp_{12} = p_{12} - \bar p_{12}$ denote the deviation of the moment
coefficient of the sample from its mean value, then

Mean value of $dp_{12}\,dp_{34}$ in many samples

$$= \frac{\chi}{n}\,(\bar p_{1234} - \bar p_{12}\,\bar p_{34}),$$

where

$$\chi = \frac{N-n}{N-1},$$

and $\bar p_{1234}$ is the product moment coefficient with respect to the four variables.



[* Dr Isserlis sent me a paper containing the results of the present note with others accompanied 
by proofs in the course of 1916. It has been impossible to publish that paper so far, but it is only 
fair to him, having regard to the fact that other investigators are now entering this field, to publish 
his formulae in association with the memoirs printed in this part of Biometrika. EDITOR.]




This result gives many particular cases if we identify two or more of the variables. For instance

Mean value of $dp_{1^2}\,dp_{2^2} = \dfrac{\chi}{n}\,(\bar p_{1^22^2} - \bar p_{1^2}\,\bar p_{2^2})$,

where $\bar p_{1^2}$ is the second moment coefficient with regard to $x_1$, i.e. $\mu'_2$ in the usual notation for one
variable, and so forth. Also

Mean value of $(dp_{12})^2 = \dfrac{\chi}{n}\,(\bar p_{1^22^2} - \bar p_{12}^{\,2})$.
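
The result of (A) can be checked exactly on a small finite population by enumerating every possible sample. The sketch below is only an illustration: the six-member population is hypothetical, sampling is taken to be without replacement, and the factor is taken as (N − n)/(N − 1).

```python
from fractions import Fraction
from itertools import combinations


def mean(values):
    values = list(values)
    return sum(values, Fraction(0)) / len(values)


def check_case_a(population, n):
    """Return (exact mean of dp12*dp34 over all samples, value given by the formula of (A))."""
    N = len(population)
    u = [Fraction(x1 * x2) for x1, x2, _, _ in population]   # products x1*x2
    v = [Fraction(x3 * x4) for _, _, x3, x4 in population]   # products x3*x4
    P12, P34 = mean(u), mean(v)                               # population coefficients
    P1234 = mean(a * b for a, b in zip(u, v))
    exact = mean(
        (mean(u[i] for i in s) - P12) * (mean(v[i] for i in s) - P34)
        for s in combinations(range(N), n)                    # every sample of size n
    )
    chi = Fraction(N - n, N - 1)
    return exact, chi / n * (P1234 - P12 * P34)


# Hypothetical population of N = 6 individuals, each measured on x1, x2, x3, x4.
population = [(1, 2, 0, 1), (2, 0, 1, 3), (0, 1, 2, 2),
              (3, 1, 1, 0), (1, 1, 4, 1), (2, 3, 0, 2)]
exact, formula = check_case_a(population, 3)
print(exact, formula)
```

When n = N the factor vanishes, as it should, since the sample then coincides with the population.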


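B. Similarly if there are 6 independent variables $x_1, x_2, \ldots x_5, x_6$.
The mean value of $dp_{12}\,dp_{34}\,dp_{56}$

$$= \frac{\chi\chi'}{n^2}\,(\bar p_{123456} - \bar p_{1234}\,\bar p_{56} - \bar p_{3456}\,\bar p_{12} - \bar p_{1256}\,\bar p_{34} + 2\,\bar p_{12}\,\bar p_{34}\,\bar p_{56}),$$

where

$$\chi' = \frac{N-2n}{N-2}.$$

Particular cases are

Mean value of $(dp_{1^2})^3 = \dfrac{\chi\chi'}{n^2}\,(\bar p_{1^6} - 3\,\bar p_{1^4}\,\bar p_{1^2} + 2\,\bar p_{1^2}^{\,3})$,

or Mean value of $(d\mu'_2)^3 = \dfrac{\chi\chi'}{n^2}\,(\mu'_6 - 3\mu'_4\mu'_2 + 2\mu'^{\,3}_2)$,

and Mean value of $dp_{12}\,dp_{23}\,dp_{31} = \dfrac{\chi\chi'}{n^2}\,[\,\bar p_{1^22^23^2} - \bar p_{12^23}\,\bar p_{31} - \bar p_{23^21}\,\bar p_{12} - \bar p_{31^22}\,\bar p_{23} + 2\,\bar p_{12}\,\bar p_{23}\,\bar p_{31}\,]$.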
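C. Let there be 8 independent variables $x_1, x_2, \ldots x_7, x_8$.
The mean value of $dp_{12}\,dp_{34}\,dp_{56}\,dp_{78}$

$$= \frac{\chi\phi}{n^2}\,[\,\bar p_{1234}\,\bar p_{5678} + \bar p_{1256}\,\bar p_{3478} + \bar p_{1278}\,\bar p_{3456} - \bar p_{1234}\,\bar p_{56}\,\bar p_{78} - \bar p_{5678}\,\bar p_{12}\,\bar p_{34}$$
$$\qquad - \bar p_{1256}\,\bar p_{34}\,\bar p_{78} - \bar p_{3478}\,\bar p_{12}\,\bar p_{56} - \bar p_{1278}\,\bar p_{34}\,\bar p_{56} - \bar p_{3456}\,\bar p_{12}\,\bar p_{78} + 3\,\bar p_{12}\,\bar p_{34}\,\bar p_{56}\,\bar p_{78}\,]$$
$$+ \frac{\chi\psi}{n^2}\,[\,\bar p_{12345678} + \bar p_{1234}\,\bar p_{5678} + \bar p_{1256}\,\bar p_{3478} + \bar p_{1278}\,\bar p_{3456} - \bar p_{123456}\,\bar p_{78} - \bar p_{123478}\,\bar p_{56}$$
$$\qquad - \bar p_{125678}\,\bar p_{34} - \bar p_{345678}\,\bar p_{12}\,],$$

where
$$\phi = 2 - 4\chi' + 3\chi'' + 4\chi'/n - 6\chi''/n,$$
$$\psi = -1 + 3\chi' - 2\chi'' - 3\chi'/n + 4\chi''/n,$$
and
$$\chi'' = \frac{N-3n}{N-3}.$$

When the sampled population is infinitely great
$$\chi = \chi' = \chi'' = 1,\qquad \phi = 1 - 2/n,\qquad \psi = 1/n.$$

As a particular case,

Mean value of $(dp_{1^2})^4 = \dfrac{\chi\phi}{n^2}\,(3\,\bar p_{1^4}^{\,2} - 6\,\bar p_{1^4}\,\bar p_{1^2}^{\,2} + 3\,\bar p_{1^2}^{\,4}) + \dfrac{\chi\psi}{n^2}\,(\bar p_{1^8} + 3\,\bar p_{1^4}^{\,2} - 4\,\bar p_{1^6}\,\bar p_{1^2})$,

or in the notation usual for a single variable,

Mean value of $(d\mu'_2)^4 = \dfrac{\chi\phi}{n^2}\,(3\mu'^{\,2}_4 - 6\mu'_4\mu'^{\,2}_2 + 3\mu'^{\,4}_2) + \dfrac{\chi\psi}{n^2}\,(\mu'_8 + 3\mu'^{\,2}_4 - 4\mu'_6\mu'_2)$.

When the sampled population is normal the results of (A), (B) and (C) can be immediately expressed
in terms of the correlation coefficients $r_{12}, r_{13}, \ldots r_{78}$, by means of the formulae established
on p. 138 in the current issue of this journal.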
ON THE MATHEMATICAL EXPECTATION OF THE
MOMENTS OF FREQUENCY DISTRIBUTIONS. 


By PROFESSOR AL. A. TCHOUPROFF of Petrograd. 


CHAPTER III

I
(1) Let us put
$$E\,[X_i - X_{(N)}]^r = \nu_{r,(N)} \qquad (1).$$
We have $\nu_{1,(N)} = 0$.
Noting that Ge aX = a ett, 


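we find $\nu_{r,(N)} = E\,[X_i - X_{(N)}]^r = E\,[X_1 - X_{(N)}]^r$.

Replacing $X_1 - X_{(N)}$ by $(X_1 - m_1) - (X_{(N)} - m_1)$, we find

$$\nu_{r,(N)} = \mu_r + \sum_{j=1}^{r-1}(-1)^j\,C_r^j\;E\,(X_1 - m_1)^{r-j}\,(X_{(N)} - m_1)^j + (-1)^r\,\mu_{r,(N)}.$$

But
$$[X_{(N)} - m_1]^j = \frac{1}{N^j}\,\big[(X_1 - m_1) + (N-1)\,(X_{(N-1)} - m_1)\big]^j,$$
where
$$X_{(N-1)} = \frac{1}{N-1}\sum_{i=2}^{N} X_i.$$
Hence
$$\nu_{r,(N)} = \mu_r + \sum_{j=1}^{r-1}\frac{(-1)^j}{N^j}\,C_r^j\sum_{h=0}^{j} C_j^h\,(N-1)^h\,\mu_{r-h}\,\mu_{h,(N-1)} + (-1)^r\,\mu_{r,(N)}$$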
— 


| 
=1(— 1)! rej N= 
mit SS int Ss & 1p! ea hess 


(iy Nr Fe | + CP (Nt ten Pn, (y-1) + (N — 1)" p,, wo} ie 


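$$= \Big(\frac{N-1}{N}\Big)^r\Big\{\mu_r + \sum_{h=1}^{r-1}(-1)^h\,C_r^h\,\mu_{r-h}\,\mu_{h,(N-1)} + (-1)^r\,\mu_{r,(N-1)}\Big\}$$
$$= \Big(\frac{N-1}{N}\Big)^r\sum_{h=0}^{r}(-1)^h\,C_r^h\,\mu_{r-h}\,\mu_{h,(N-1)} \qquad (2).$$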


J 
On the other hand, replacing X,—X(y) by 
J Bet Ni 
N [CN = TX CV ety = ars © [X,— X~w-1], 
we find: 
N-ly 
UN (Az) 2 (a OP pee aL ee) a ea (3). 
Replacing in (2) quantities of type $\mu_{h,(N-1)}$ by their values in terms of the
quantities $\mu$ (cf. Chapter I, formulae (15), (17) and (18)) we obtain:


Ne Ne ah "SI 1 
rn, ay=( N ) {be +r S Cs . Horan Ci Z1y4, | (- 1) Br—isy, j Tn, jeg 


i=0 


2h+1 Lae ot 
= sc. C; b2 he, > Sh(N = 1H = ee 1) Brin Pons, h-i+j (4), 
r-1 j 
a6 ek (N—1y lyr = (= ly Brij / a 
iNe= 1 or+1 \ 
Vor+i, (N) = ( —z-) |b | 
ee Qh 5 1 
- pa Ord tat ah WSR? = (— 1) Braisi,j Pon, nmis 5), 
r—-1 Dh+1 h-1 1 ce 
~ = Csi Meor—2h = (N— — 1) , S (= Ey Le) —itj,j Pops, ves 
r—-1 1 & ; | 
= 2 Wop *, (—1) Bp -igs,j Porta, it } 
and, hence, 
Vy, (N) = hr — FF lr a trl] Mr-2 He] f 
1 1 ©). 
a We jae Br — 118) ppg fg — $7) ppg s+ FTA ppg us| +... 
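As N increases the ratio $\nu_{r,(N)}/\mu_r$ tends in this manner to the limit 1.

(2) Putting r = 2, 3, 4, 5, 6, in (2) we find without difficulty

$$\nu_{2,(N)} = \frac{N-1}{N}\,\mu_2 = \mu_2 - \frac1N\,\mu_2,$$

$$\nu_{3,(N)} = \frac{(N-1)(N-2)}{N^2}\,\mu_3 = \mu_3 - \frac3N\,\mu_3 + \frac{2}{N^2}\,\mu_3,$$

$$\nu_{4,(N)} = \mu_4 - \frac2N\,[2\mu_4 - 3\mu_2^2] + \frac{3}{N^2}\,[2\mu_4 - 5\mu_2^2] - \frac{3}{N^3}\,[\mu_4 - 3\mu_2^2]\;{}^*,$$

$$\nu_{5,(N)} = \mu_5 - \frac5N\,[\mu_5 - 2\mu_3\mu_2] + \frac{10}{N^2}\,[\mu_5 - 5\mu_3\mu_2] - \frac{10}{N^3}\,[\mu_5 - 8\mu_3\mu_2] + \frac{4}{N^4}\,[\mu_5 - 10\mu_3\mu_2],$$

$$\nu_{6,(N)} = \mu_6 - \frac3N\,[2\mu_6 - 5\mu_4\mu_2] + \frac{5}{N^2}\,[3\mu_6 - 15\mu_4\mu_2 - 4\mu_3^2 + 9\mu_2^3]$$
$$\qquad - \frac{5}{N^3}\,[4\mu_6 - 33\mu_4\mu_2 - 16\mu_3^2 + 42\mu_2^3] + \frac{5}{N^4}\,[3\mu_6 - 36\mu_4\mu_2 - 22\mu_3^2 + 63\mu_2^3]$$
$$\qquad - \frac{5}{N^5}\,[\mu_6 - 15\mu_4\mu_2 - 10\mu_3^2 + 30\mu_2^3] \qquad (7).$$
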


* For footnote see page 187. 
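
The first few of these expectations can be confirmed by brute force for a small discrete law. The sketch below is an illustration only: it assumes N independent trials of a hypothetical three-valued variable, enumerates every possible outcome, and compares the exact value of E[X_1 − X_(N)]^r with the closed forms for r = 2, 3, 4 (the last in the factored form (N−1)(N²−3N+3)/N³ · μ_4 + 3(N−1)(2N−3)/N³ · μ_2², which is algebraically the same as the expansion in powers of 1/N).

```python
from fractions import Fraction
from itertools import product

# Hypothetical law of distribution: (value, probability) pairs.
DIST = [(Fraction(0), Fraction(1, 2)),
        (Fraction(1), Fraction(1, 3)),
        (Fraction(4), Fraction(1, 6))]


def central_moment(dist, r):
    m1 = sum(p * x for x, p in dist)
    return sum(p * (x - m1) ** r for x, p in dist)


def nu_exact(dist, N, r):
    """Exact E[(X_1 - mean of N trials)^r], summing over all outcomes of N trials."""
    total = Fraction(0)
    for outcome in product(dist, repeat=N):
        prob = Fraction(1)
        for _, p in outcome:
            prob *= p
        xbar = sum(x for x, _ in outcome) / N
        total += prob * (outcome[0][0] - xbar) ** r
    return total


def nu_formula(dist, N, r):
    mu2, mu3, mu4 = (central_moment(dist, k) for k in (2, 3, 4))
    if r == 2:
        return Fraction(N - 1, N) * mu2
    if r == 3:
        return Fraction((N - 1) * (N - 2), N ** 2) * mu3
    if r == 4:
        return (Fraction((N - 1) * (N * N - 3 * N + 3), N ** 3) * mu4
                + Fraction(3 * (N - 1) * (2 * N - 3), N ** 3) * mu2 ** 2)
    raise ValueError("only r = 2, 3, 4 are implemented in this sketch")


for r in (2, 3, 4):
    print(r, nu_exact(DIST, 3, r), nu_formula(DIST, 3, r))
```

Rational arithmetic keeps the comparison exact, so any discrepancy would indicate an error in a coefficient rather than rounding.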
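(3) If the law of distribution of the values of X satisfies the condition:
$$\mu_{2i+1} = 0,\qquad i = 0, 1, 2, \ldots \infty,$$
then, as we saw above, $\mu_{2i+1,(N)} = 0$, $i = 0, 1, 2, \ldots \infty$ and, as appears from (5),
$\nu_{2i+1,(N)} = 0$, $i = 0, 1, 2, \ldots \infty$. When the law of distribution of the values of X is
Gaussian and $\mu_{2i} = 1\cdot3\cdot5\ldots(2i-1)\,\mu_2^{\,i}$, $i = 1, 2, \ldots \infty$, then, as we have seen,
$\mu_{2i,(N)} = 1\cdot3\cdot5\ldots(2i-1)\,\dfrac{\mu_2^{\,i}}{N^i}$ and, consequently,

$$\nu_{2i,(N)} = \Big(\frac{N-1}{N}\Big)^{2i}\sum_{h=0}^{i} C_{2i}^{2h}\,\mu_{2i-2h}\,\mu_{2h,(N-1)}$$
$$= \Big(\frac{N-1}{N}\Big)^{2i}\sum_{h=0}^{i} C_{2i}^{2h}\;\frac{1\cdot3\cdot5\ldots(2i-2h-1)\;\cdot\;1\cdot3\cdot5\ldots(2h-1)}{(N-1)^h}\;\mu_2^{\,i}$$
$$= \Big(\frac{N-1}{N}\Big)^{2i}\,1\cdot3\cdot5\ldots(2i-1)\;\mu_2^{\,i}\sum_{h=0}^{i}\frac{C_i^h}{(N-1)^h}$$
$$= \Big(\frac{N-1}{N}\Big)^{i}\,\mu_{2i} = (N-1)^i\,\mu_{2i,(N)} \qquad (8).$$
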


II 
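(1) Let us put

$$\nu'_{r,(N)} = \frac1N\sum_{i=1}^{N}[X_i - X_{(N)}]^r,\qquad W^{(m)}_{r,(N)} = E\Big\{\sum_{i=1}^{N}[X_i - X_{(N)}]^r\Big\}^m \qquad (9).$$

We have:

$$E\,\nu'_{r,(N)} = \nu_{r,(N)} = \frac1N\,W^{(1)}_{r,(N)},$$
$$E\,[\nu'_{r,(N)}]^m = \frac{1}{N^m}\,W^{(m)}_{r,(N)},$$
$$E\,[\nu'_{r,(N)} - \nu_{r,(N)}]^m = \sum_{h=0}^{m}(-1)^h\,C_m^h\,\frac{1}{N^{m-h}}\,W^{(m-h)}_{r,(N)}\;\nu^{\,h}_{r,(N)} \qquad (10).$$

(2) When m = 2 we find:

$$W^{(2)}_{r,(N)} = E\Big\{\sum_{i=1}^{N}[X_i - X_{(N)}]^r\Big\}^2$$
$$= E\sum_{i=1}^{N}[X_i - X_{(N)}]^{2r} + E\sum_{i=1}^{N}\sum_{j\ne i}[X_i - X_{(N)}]^r\,[X_j - X_{(N)}]^r$$
$$= N\,\nu_{2r,(N)} + N(N-1)\,E\,[X_1 - X_{(N)}]^r\,[X_2 - X_{(N)}]^r \qquad (11).$$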

* The quantity $\nu_{4,(N)}$ can also be written in the form

$$\nu_{4,(N)} = \frac{(N-1)(N-2)}{N^2}\,\mu_4 - \frac{(N-1)(N-3)}{N^3}\,(\mu_4 - 3\mu_2^2) + \frac{3(N-1)}{N^2}\,\mu_2^2.$$

R. Henderson ("Frequency Curves and Moments," J. I. A., 1907), while giving correctly the values
$\mu_{2,(N)}$, $\mu_{3,(N)}$, $\mu_{4,(N)}$, $\nu_{2,(N)}$ and $\nu_{3,(N)}$, erroneously gives (p. 435):

$$\nu_{4,(N)} = \frac{(N-1)(N-2)}{N^2}\,\mu_4 - \frac{(N-1)(N-3)}{N^3}\,(\mu_4 - 3\mu_2^2).$$

The true value of $\nu_{4,(N)}$ exceeds the value obtained by Henderson by $\dfrac{3(N-1)}{N^2}\,\mu_2^2$.

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reviiare Niji oe 

we have | | 

[X,— Xun] X2— Xen] = ya LV 1)(X,—m) — (X= m) 

— (N= 2)(X (yay = 1m) [CV = 1) (Xa = ma) — (Xa — ma) — (N= 2) (Kava = m,)]} 
=/(%,X)- (Py = [(X1 —m,) + (X2—m)] |X ww y=) (ee = [X (v-2) -—m P, 


where 


F(X, X2) =[Xr— 7m] [X. —m,] — SSS Xe —m,)+ (X,—m,)}. 
Hence: 
BX, — Xi) [Xe- Xo] 
bee N-2 
=8 (ft, Xo 3 Cp cher. xar Ge) 

x {[((X — my) + (Xe — m)] [X w-2) — 2m] — [Xw—2) — ml?) 

: N—-2\? - Tate 

= E (F(X. Xl +0 (ee) oa,ar-n BK XO 


a OR ee LN Ne eeu 4 es 
43-3 OO (= =) sin, ev -2y EL f (X,,Xo)}""* [(Xp— my) +(Xa— m4) PE. 


h=2 k=h N 
But (see Chapter I (15) and (16)) 
Ent. (2)-1 1 


fa AT => sy s 1 k sted oh k A +9 
Bay? = (N — DA Ey Ey eye Ge y Ban, GF 5) ith k, Ent. (E) i+, 


5 ee  & 2 Nee 
Eif (X,, X,)}* => > (= yr! Ne y ct ap Hosta Ms—f+g> 
S=0 g=0 


E{f(X,, X2)}* ee —m,)+(X,— m)Pr* 


Sh 2h+2f-k GQaaly ail \f 
7 
=> py (— WOO: h Con -497-k Wile Msthaf—k—g Ps—h-ftg- 
f=0 g=0 : 


Hence: 


7 Te r N- 
N(N-1) BX, — Xen (X, —- Xin = Z xe vy pO Cyt trts 


rel (N 1)" (N=2) 
+ O, py rae (j ee C1 Oop Mrmr feg Mra ift9 


* The development of E {f (X,X»)}"—*[(X,— my) + (X2—- my) *-* in powers of N does not contain 
any terms of higher degree than N° while at the same time the development of uj, (y_») contains no 


terms of higher degree than To obtain Ely, (Ny) 7, (wy)? correct to 1/N# it is con- 


1 
‘ k+1) * 
N kat. (=) 


sequently sufficient to carry our calculations as far as k=2t. 
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r—2 2f4+2 ; N pee i ae, (N— 2) 
2 uP 
af OF Ke e i (- ye Ces Cas ( Nort braf—g br-2-f+9 
S=0 g= 


k(orr) r—h 2h+2f-k h pyk-h vf g 
Ss we k+f+j 
S Bs bee CO, Cr 3 Os op 
3 z=0 )=0 h=Ent k+1 f=0 g=0 
a 


: k+l\_. 
(N— 1H (N — 2) Ent. (F*)-# a 
eS Neth Brant. (5)-é+i, j +k, Ent. (E) iti Mr+h+f—k—g Pr-h—f+g 


€ 


= N? pw? — (per? + 203" [Pings Pra + Hy] ae apes be — 20? [Mr rae + H,-1] fo} 
+ {OP [Apa Bra + Ayr) +e [2brss bro + Sérgr Mra + 6p,’ | 
— 80,7 pyr pa 207) Oya [Hr More + Mya] Me 
— 2077 One [Mts Mr—s + Aor Meme + 3 py 7-1] Me — LAC? pp Myo Me 
— 140? 7,1 pe — 4c? Pr-i Pr—2 B3 
— 28 [My Mrs + B pra Mya] ls + BCP WP pig Me? + 18C,9 [Mya Mra + Pe pao] fo” 
+ 608 [Mp Bria + Aya yes + 3p’ ,..2] pee} +.... 


Substituting in (11) and replacing $\nu_{2r,(N)}$ in (11) by its value in terms of the
$\mu$'s by formula (4), we find after reduction:


W = N22 + N {pop — (27 +1) we — QP png pa 


) 
, (N) | 


+ pg fy +P (1 = 1) py fps Ha} 


iz {2a — 7 (2r— 1) papa ba — 49? fpr ra — 7(897 + 1) py? — 17 — 1) Mp ys Mrs | 
+ 37? wp Mote (7 — 1) (40 +1) py Myo Ho FO fp ros Mo | (12), 
+9? (7-1) pra roe Ms 


rl} 9 9.9 9 9 9 
a eon Mp rs ls Br? (r — LP pepe oe? — 9° (7 — 1) (7 — 2) php Mrs Me” 


Ste 


yl] z 
eee hrs ut +... 
and hence 
P Lee) 2 \ 

E [vs 7) — OP = ae We, ay — >, ay | 

1 7 | 
= ay (Mor = Me? — 20 pers Mra + 7° Wa Ma} | 

1 és 
— 7 27 pop — 7 (27 — 1) pope Me — 7 (7 — 1) rye Mpg — AP? gr Mp (13). 


+15) pas Mpms Mo i 
—7 (+2) wP+2 (P+ 1) 7 (9 1) Be Mpae b+ 30 Wp ho F.°(9 1) opr bya Ms | 


r(r-1ly , 
= 18 (P= 1) (P= 2) pop prs pat — Pe, pt ie 
(3) When m = 3 and m = 4, we have similarly:

$$W^{(3)}_{r,(N)} = N\,\nu_{3r,(N)} + 3N(N-1)\,E\,[X_1 - X_{(N)}]^{2r}\,[X_2 - X_{(N)}]^r$$
$$\qquad + N(N-1)(N-2)\,E\,[X_1 - X_{(N)}]^r\,[X_2 - X_{(N)}]^r\,[X_3 - X_{(N)}]^r,$$

$$W^{(4)}_{r,(N)} = N\,\nu_{4r,(N)} + 4N(N-1)\,E\,[X_1 - X_{(N)}]^{3r}\,[X_2 - X_{(N)}]^r$$
$$\qquad + 3N(N-1)\,E\,[X_1 - X_{(N)}]^{2r}\,[X_2 - X_{(N)}]^{2r}$$
$$\qquad + 6N(N-1)(N-2)\,E\,[X_1 - X_{(N)}]^{2r}\,[X_2 - X_{(N)}]^r\,[X_3 - X_{(N)}]^r$$
$$\qquad + N(N-1)(N-2)(N-3)\,E\,[X_1 - X_{(N)}]^r\,[X_2 - X_{(N)}]^r\,[X_3 - X_{(N)}]^r\,[X_4 - X_{(N)}]^r.$$

Determining $W^{(3)}_{r,(N)}$ and $W^{(4)}_{r,(N)}$ from these relations and substituting the
values found in (10), we obtain $E\,[\nu'_{r,(N)} - \nu_{r,(N)}]^3$ and $E\,[\nu'_{r,(N)} - \nu_{r,(N)}]^4$. Their
expression exhibits no special difficulties, but is so unwieldy that I do not
give them here, contenting myself with the deduction of $E\,[\nu'_{2,(N)} - \nu_{2,(N)}]^3$ and
$E\,[\nu'_{2,(N)} - \nu_{2,(N)}]^4$ which will be shown below (see Chapter IV, § III).


III
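(1) Noting that

$$\sum_{i=1}^{N}[X_i - X_{(N)}]^2 = \sum_{i=1}^{N}(X_i - m_1)^2 - \frac1N\Big[\sum_{i=1}^{N}(X_i - m_1)\Big]^2,$$

and putting

$$U^{(r,s)}_{2,(N)} = E\Big\{\Big[\sum_{i=1}^{N}(X_i - m_1)^2\Big]^r\Big[\sum_{i=1}^{N}(X_i - m_1)\Big]^s\Big\},$$

$$Z^{(r,s)}_{2,(N)} = E\Big\{\Big[\sum_{i=1}^{N}(X_i - X_{(N)})^2\Big]^r\Big[\sum_{i=1}^{N}(X_i - m_1)\Big]^s\Big\},$$

we find

$$W^{(m)}_{2,(N)} = E\Big\{\sum_{i=1}^{N}(X_i - m_1)^2 - \frac1N\Big[\sum_{i=1}^{N}(X_i - m_1)\Big]^2\Big\}^m$$
$$= V^{(m)}_{2,(N)} + \sum_{h=1}^{m-1}(-1)^h\,C_m^h\,\frac{1}{N^h}\,U^{(m-h,\,2h)}_{2,(N)} + (-1)^m\,N^m\,\mu_{2m,(N)} \qquad (15),$$

$$V^{(m)}_{2,(N)} = E\Big\{\sum_{i=1}^{N}(X_i - m_1)^2\Big\}^m = W^{(m)}_{2,(N)} + \sum_{h=1}^{m-1} C_m^h\,\frac{1}{N^h}\,Z^{(m-h,\,2h)}_{2,(N)} + N^m\,\mu_{2m,(N)},$$

or

$$W^{(m)}_{2,(N)} = V^{(m)}_{2,(N)} - \sum_{h=1}^{m-1} C_m^h\,\frac{1}{N^h}\,Z^{(m-h,\,2h)}_{2,(N)} - N^m\,\mu_{2m,(N)},$$

and, on the other hand,

$$U^{(r,s)}_{2,(N)} = \sum_{h=0}^{r-1} C_r^h\,\frac{1}{N^h}\,Z^{(r-h,\,s+2h)}_{2,(N)} + N^{r+s}\,\mu_{2r+s,(N)},$$

$$Z^{(r,s)}_{2,(N)} = \sum_{h=0}^{r-1}(-1)^h\,C_r^h\,\frac{1}{N^h}\,U^{(r-h,\,s+2h)}_{2,(N)} + (-1)^r\,N^{r+s}\,\mu_{2r+s,(N)},$$

whence:

$$\sum_{h=1}^{r-1} C_r^h\,\frac{1}{N^h}\Big[Z^{(r-h,\,s+2h)}_{2,(N)} + (-1)^h\,U^{(r-h,\,s+2h)}_{2,(N)}\Big] + N^{r+s}\,\mu_{2r+s,(N)}\,[1 + (-1)^r] = 0 \qquad (17).$$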


(2) Noting that 
N r N 8 
Oy | = mye > (X;—m)| [X= m)+ = (Xi—m)| 
i=2 i=2 


2, (N) 


Sal. . 
= Ports + = C; Poprs—j (NM — 1) py, (—1) + Mar (N — 1) ps, w-1) 
fe 


TaD Ai (h) eee J (h, j) 
ap > Cc. Mor+s--2h Vs ,(N— 1) + > > Ge C Mor+s—2h—j Ue (N-1) 

h=1 h=1 j=1 

i. ue (r) s (7,3) (7, 8) 
+= Gs Moar—2h us ya Ms Ves (N- yt RS C Bs-j Us (N-1) + Ue , (N- 1)’ 
h=1 ~ j=l 


we find : 
(7, 8) (7, 8 J a i h 7 (h) 
Usa — Ug ix = ECF tenon OF —1Y ty, v1) + Or Maresh Vaw-p 
j=0 i= 
Se : ‘ r—-1 
= h j (h, j) h (h,s) 
« a = CO, Morssonj Uy "yy + a C,, Meraat Us waa) 
b=1 j= n= 


and hence: 


gy” Ss) 2 C = N j > a s y 
2, (N) — Mor+s—j = ( —f). Bj,(N-f) ae ee r Mor+s—2h = 2,(N—f) | 


Sys (h, 8) Soh had te) 
+ = Citeeean S Us n-. pt 2 SSO, CO) borzsonji & Uo w_py 
h=1 0 =1 j= 1 fal 


ane here (see Chapter I (14) and Chapter IT (10)) 


Fe | Ent. (4)-1 oe (2)-#+1] 
i NF) np > j Tj, unt. (£)-i 
fo Sy () Sc 
9) 
Sy $ NEON 


TO Dies, oped 2087? 


and noting that 


(0, 8) 
To an = N* Ms, 


u” 0) = y 


RCN) ae. 2)? 


1, S h 
Une = W luoiet 3 CO, Bevan (NV — 1) pao} 


2,(N) 


=Nisyet & NEV 3, <OF Peron La, j, 


192 Expectation of Moments of Frequency Distributions 
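we find in consequence:

$$U^{(1,2)}_{2,(N)} = N\mu_4 + N^{[-2]}\mu_2^2,$$

$$U^{(1,3)}_{2,(N)} = N\mu_5 + 4N^{[-2]}\mu_3\mu_2,$$

$$U^{(1,4)}_{2,(N)} = N\mu_6 + 7N^{[-2]}\mu_4\mu_2 + 4N^{[-2]}\mu_3^2 + 3N^{[-3]}\mu_2^3,$$

$$U^{(1,6)}_{2,(N)} = N\mu_8 + N^{[-2]}\,[16\mu_6\mu_2 + 26\mu_5\mu_3 + 15\mu_4^2] + N^{[-3]}\,[60\mu_4\mu_2^2 + 70\mu_3^2\mu_2] + 15N^{[-4]}\mu_2^4,$$

$$U^{(2,1)}_{2,(N)} = N\mu_5 + 2N^{[-2]}\mu_3\mu_2,$$

$$U^{(2,2)}_{2,(N)} = N\mu_6 + 3N^{[-2]}\mu_4\mu_2 + 2N^{[-2]}\mu_3^2 + N^{[-3]}\mu_2^3,$$

$$U^{(2,4)}_{2,(N)} = N\mu_8 + N^{[-2]}\,[8\mu_6\mu_2 + 12\mu_5\mu_3 + 7\mu_4^2] + N^{[-3]}\,[16\mu_4\mu_2^2 + 20\mu_3^2\mu_2] + 3N^{[-4]}\mu_2^4,$$

$$U^{(3,2)}_{2,(N)} = N\mu_8 + N^{[-2]}\,[4\mu_6\mu_2 + 6\mu_5\mu_3 + 3\mu_4^2] + N^{[-3]}\,[6\mu_4\mu_2^2 + 6\mu_3^2\mu_2] + N^{[-4]}\mu_2^4.$$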

(3) Substituting the values we have found for $U^{(r,s)}_{2,(N)}$ in (15) we obtain

$$W^{(1)}_{2,(N)} = (N-1)\,\mu_2,$$

$$W^{(2)}_{2,(N)} = \frac{N-1}{N}\,\{(N-1)\,\mu_4 + (N^2 - 2N + 3)\,\mu_2^2\}$$
$$= N^2\mu_2^2 + N\,[\mu_4 - 3\mu_2^2] - [2\mu_4 - 5\mu_2^2] + \frac1N\,[\mu_4 - 3\mu_2^2]\;{}^*,$$

$$W^{(3)}_{2,(N)} = \frac{N-1}{N^2}\,\{(N-1)^2\,\mu_6 + 3(N-1)(N^2 - 2N + 5)\,\mu_4\mu_2$$
$$\qquad - 2(3N^2 - 6N + 5)\,\mu_3^2 + (N-2)(N^3 - 3N^2 + 9N - 15)\,\mu_2^3\}$$
$$= N^3\mu_2^3 + N^2\,[3\mu_4\mu_2 - 6\mu_2^3] + N\,[\mu_6 - 12\mu_4\mu_2 - 6\mu_3^2 + 20\mu_2^3]$$
$$\qquad - [3\mu_6 - 30\mu_4\mu_2 - 18\mu_3^2 + 48\mu_2^3] + \frac1N\,[3\mu_6 - 36\mu_4\mu_2 - 22\mu_3^2 + 63\mu_2^3]$$
$$\qquad - \frac{1}{N^2}\,[\mu_6 - 15\mu_4\mu_2 - 10\mu_3^2 + 30\mu_2^3] \qquad (20),$$

* The value of $\dfrac{1}{N^2}\,W^{(2)}_{2,(N)}$ is given in Student's paper "The Probable Error of a Mean" (Biometrika,
Vol. VI, p. 3), but unfortunately misprints have crept into the formula.
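$$W^{(4)}_{2,(N)} = \frac{N-1}{N^3}\,\{(N-1)^3\,\mu_8 + 4(N-1)^2(N^2 - 2N + 7)\,\mu_6\mu_2$$
$$\qquad + (3N^4 - 12N^3 + 42N^2 - 60N + 35)\,\mu_4^2$$
$$\qquad - (N-1)(24N^2 - 48N + 56)\,\mu_5\mu_3$$
$$\qquad + 6(N-2)(N^4 - 4N^3 + 16N^2 - 40N + 35)\,\mu_4\mu_2^2$$
$$\qquad - (N-2)(24N^3 - 120N^2 + 280N - 280)\,\mu_3^2\mu_2$$
$$\qquad + (N-2)(N-3)(N^4 - 4N^3 + 18N^2 - 60N + 105)\,\mu_2^4\}$$
$$= N^4\mu_2^4 + N^3\,[6\mu_4\mu_2^2 - 10\mu_2^4]$$
$$\qquad + N^2\,[4\mu_6\mu_2 + 3\mu_4^2 - 42\mu_4\mu_2^2 - 24\mu_3^2\mu_2 + 53\mu_2^4]$$
$$\qquad + N\,[\mu_8 - 20\mu_6\mu_2 - 15\mu_4^2 - 24\mu_5\mu_3 + 180\mu_4\mu_2^2 + 192\mu_3^2\mu_2 - 218\mu_2^4]$$
$$\qquad - [4\mu_8 - 64\mu_6\mu_2 - 54\mu_4^2 - 96\mu_5\mu_3 + 576\mu_4\mu_2^2 + 688\mu_3^2\mu_2 - 687\mu_2^4]$$
$$\qquad + \frac1N\,[6\mu_8 - 112\mu_6\mu_2 - 102\mu_4^2 - 176\mu_5\mu_3 + 1122\mu_4\mu_2^2 + 1360\mu_3^2\mu_2 - 1398\mu_2^4]$$
$$\qquad - \frac{1}{N^2}\,[4\mu_8 - 92\mu_6\mu_2 - 95\mu_4^2 - 160\mu_5\mu_3 + 1110\mu_4\mu_2^2 + 1400\mu_3^2\mu_2 - 1515\mu_2^4]$$
$$\qquad + \frac{1}{N^3}\,[\mu_8 - 28\mu_6\mu_2 - 35\mu_4^2 - 56\mu_5\mu_3 + 420\mu_4\mu_2^2 + 560\mu_3^2\mu_2 - 630\mu_2^4] \qquad (21).$$
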

In the case when the law of distribution of the variable X follows the Gauss-Laplace law:

$$W^{(1)}_{2,(N)} = (N-1)\,\mu_2 = V^{(1)}_{2,(N-1)},$$
$$W^{(2)}_{2,(N)} = (N-1)(N+1)\,\mu_2^2 = V^{(2)}_{2,(N-1)},$$
$$W^{(3)}_{2,(N)} = (N-1)(N+1)(N+3)\,\mu_2^3 = V^{(3)}_{2,(N-1)},$$
$$W^{(4)}_{2,(N)} = (N-1)(N+1)(N+3)(N+5)\,\mu_2^4 = V^{(4)}_{2,(N-1)} \qquad (22).$$



(4) Substituting the values found for Woven liege Wer yy and we oe in 
(10), we find : 


(=p (W-1) (N38 
Ev’, «wy — Y, apres NE , ba — eile We 2) i; 


= [= pe? | — ihe 2ps'] + 75 ee 3 ps" ] 


* Cf. Student, ‘‘The Probable Error of a Mean” (Biometrika, Vol. vt). 
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(N-1¥ 3(N—-1)¥(N-—5) \ 
Pe 


yf « 
E [v's (yy — ¥2, «}' = We fs fey 


_2(N=1)BN*-6N 45), 2(V—1)(N2- 12 + 15) 


ig ve fe 
\ (24%, 


1 en : 1 e ‘ 
~ N2 [os — Spy fe — 6p,” + 2yro*] — Ne [Bps — 21 pay fo — 185° + 26 p10" 


1 9 q 1 9 ry 
+ Ne [85 — BB pry Me — 224.7 + 540°] — Ne [pos — 15 poy oe — 10 pes? + 80m] | 
E / 4 3 919 
Y [v"s, cy — ¥», cx} = aya [Ha — HP 
1 = 5 : 
+ apa [Hs = doe Me — 15 ype? — bps bs + 48 ps oa? + 963" He — BOM") 


1 Z ; . } 
— pile — 40 He pa — Bega — 96 5 + BBC, pat + 528 p42 wa— 806us'] | 


SS ee 


a 
Ni 


— a [4s — 92p5 oo — 95? — 1605 wy + 1110 py pr? + 1400p, wy — 1515 p54] 


+ = [6p — 96 peg fo — 102pu2 — 176 pos fog + 924 peg fo” + 1232u1,? wy, — 1044254] 


Sa eee 


1 Dae 5 
+ 7 [os — 28g be — B5 pu? — 56s fy + 4204 pe? + 5603? We — 630 pro") 


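In the case when the law of distribution of the values of X follows the Gauss-Laplace law:

$$E[\nu'_{2,(N)} - \nu_{2,(N)}]^2 = \frac{2(N-1)}{N^2}\,\mu_2^2,$$

$$E[\nu'_{2,(N)} - \nu_{2,(N)}]^3 = \frac{8(N-1)}{N^3}\,\mu_2^3,$$

$$E[\nu'_{2,(N)} - \nu_{2,(N)}]^4 = \frac{12(N-1)(N+3)}{N^4}\,\mu_2^4,$$

$$\frac{E[\nu'_{2,(N)} - \nu_{2,(N)}]^4}{\{E[\nu'_{2,(N)} - \nu_{2,(N)}]^2\}^2} = \frac{3(N+3)}{N-1}.$$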

CHAPTER IV 
I 


(1) We may also follow another road, to deduce the formulae obtained above, 
a road nearer to the one usual in English literature. 


Let us denote by $n_j$ the number of times in $N$ experiments the variable $X$ takes the value $\xi_j$, one of its $k$ possible values (cf. above, Chapter I, § I), and putting

$$p_j' = \frac{n_j}{N},$$

we have:

$$N = \sum_{j=1}^{k} n_j, \qquad \sum_{j=1}^{k} p_j' = 1,$$

$$m_r = \sum_{j=1}^{k} p_j\,\xi_j^r, \qquad m_r' = \sum_{j=1}^{k} p_j'\,\xi_j^r,$$

$$\mu_r = \sum_{j=1}^{k} p_j\,[\xi_j - m_1]^r, \qquad \mu_r' = \sum_{j=1}^{k} p_j'\,[\xi_j - m_1]^r,$$

$$X_{(N)} = \sum_{j=1}^{k} p_j'\,\xi_j = m_1'.$$


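But

$$E n_i = N p_i,$$

$$E n_i^2 = N p_i + N^{[-2]} p_i^2 = N^2 p_i^2 + N p_i (1 - p_i),$$

$$E n_i^3 = N p_i + 3 N^{[-2]} p_i^2 + N^{[-3]} p_i^3 = N^3 p_i^3 + 3 N^2 p_i^2 (1 - p_i) + N p_i (1 - p_i)(1 - 2 p_i),$$

$$E n_i^4 = N p_i + 7 N^{[-2]} p_i^2 + 6 N^{[-3]} p_i^3 + N^{[-4]} p_i^4$$
$$= N^4 p_i^4 + 6 N^3 p_i^3 (1 - p_i) + N^2 p_i^2 (1 - p_i)(7 - 11 p_i) + N p_i (1 - p_i)(1 - 6 p_i + 6 p_i^2) \qquad (1),$$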
and in general, as is not difficult to see*, 
r oe r—h ; ' 
Eng = = NIV Gs 9 Dil ae N* pi 2 Cl a, ngBisy7 Pe os.00: (2). 
si h=1 : =0 
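The four expectations written out in (1) can be confirmed by direct summation over the binomial law. This is a minimal sketch in exact rational arithmetic; the function names are illustrative and the closed forms are those of (1) expanded in powers of $N$.

```python
from fractions import Fraction
from math import comb

def binomial_raw_moment(N, p, r):
    """Exact E[n^r] for n following the binomial law with N trials and chance p."""
    p = Fraction(p)
    return sum(
        Fraction(h) ** r * comb(N, h) * p ** h * (1 - p) ** (N - h)
        for h in range(N + 1)
    )

def closed_form(N, p, r):
    """Expanded right-hand sides of (1) for r = 1, 2, 3, 4."""
    p = Fraction(p)
    q = 1 - p
    if r == 1:
        return N * p
    if r == 2:
        return N**2 * p**2 + N * p * q
    if r == 3:
        return N**3 * p**3 + 3 * N**2 * p**2 * q + N * p * q * (1 - 2 * p)
    if r == 4:
        return (N**4 * p**4 + 6 * N**3 * p**3 * q
                + N**2 * p**2 * q * (7 - 11 * p)
                + N * p * q * (1 - 6 * p + 6 * p**2))
    raise ValueError("only r = 1..4 are written out in (1)")

for r in (1, 2, 3, 4):
    print(r, binomial_raw_moment(8, Fraction(1, 3), r) == closed_form(8, Fraction(1, 3), r))
```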


Further, denoting by $P_h^{(i)}$ the probability of $n_i$ taking the value $h$, and by $E^{(h)} n_j$ the conditional mathematical expectation of $n_j$ on the assumption that $n_i$ takes the value $h$, we find:

$$P_h^{(i)} = C_N^h\, p_i^h (1 - p_i)^{N-h},$$

$$E^{(h)} n_j = (N - h)\,\frac{p_j}{1 - p_i},$$

$$E n_i n_j = \sum_{h=0}^{N} P_h^{(i)}\, h\, E^{(h)} n_j = \sum_{h=0}^{N} (N - h)\, h\, C_N^h\, p_i^h (1 - p_i)^{N-h-1}\, p_j = N(N-1)\, p_i p_j.$$
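The conditioning argument may be followed numerically. The sketch below sums, over the possible values $h$ of $n_i$, the binomial probability of $h$ times $h$ times the conditional expectation $(N-h)\,p_j/(1-p_i)$; the function name is illustrative and $p_i < 1$ is assumed.

```python
from fractions import Fraction
from math import comb

def cross_moment_by_conditioning(N, p_i, p_j):
    """E[n_i n_j] as sum over h of P(n_i = h) * h * E[n_j | n_i = h]."""
    p_i, p_j = Fraction(p_i), Fraction(p_j)
    total = Fraction(0)
    for h in range(N + 1):
        prob_h = comb(N, h) * p_i ** h * (1 - p_i) ** (N - h)
        cond_mean = (N - h) * p_j / (1 - p_i)  # remaining trials share 1 - p_i
        total += prob_h * h * cond_mean
    return total

N, p_i, p_j = 7, Fraction(1, 5), Fraction(3, 10)
print(cross_moment_by_conditioning(N, p_i, p_j), N * (N - 1) * p_i * p_j)
```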


Similarly we obtain:

$$E n_{i_1} n_{i_2} n_{i_3} = N(N-1)(N-2)\, p_{i_1} p_{i_2} p_{i_3},$$

$$E n_{i_1} n_{i_2} \dots n_{i_k} = N(N-1)\dots(N-k+1)\, p_{i_1} p_{i_2} \dots p_{i_k} \qquad (3),$$

* See my paper, "On the Mathematical Expectation of a Positive Integral Power of the Difference between the Frequency and the Probability of an Event," in the Proceedings of the Petrograd Polytechnic Institute.
$$E n_i^2 n_j = N^{[-2]} p_i p_j + N^{[-3]} p_i^2 p_j = N^3 p_i^2 p_j + N^2 p_i p_j (1 - 3p_i) - N p_i p_j (1 - 2p_i),$$

$$E n_i^2 n_j^2 = N^{[-2]} p_i p_j + N^{[-3]} p_i p_j (p_i + p_j) + N^{[-4]} p_i^2 p_j^2$$
$$= N^4 p_i^2 p_j^2 + N^3 p_i p_j (p_i + p_j - 6 p_i p_j) + N^2 p_i p_j (1 - 3p_i - 3p_j + 11 p_i p_j) - N p_i p_j (1 - 2p_i - 2p_j + 6 p_i p_j),$$

$$E n_i^3 n_j = N^{[-2]} p_i p_j + 3 N^{[-3]} p_i^2 p_j + N^{[-4]} p_i^3 p_j$$
$$= N^4 p_i^3 p_j + 3 N^3 p_i^2 p_j (1 - 2p_i) + N^2 p_i p_j (1 - 9p_i + 11 p_i^2) - N p_i p_j (1 - 6p_i + 6p_i^2),$$

$$E n_i^2 n_j n_h = N^{[-3]} p_i p_j p_h + N^{[-4]} p_i^2 p_j p_h$$
$$= N^4 p_i^2 p_j p_h + N^3 p_i p_j p_h (1 - 6p_i) - N^2 p_i p_j p_h (3 - 11 p_i) + 2N p_i p_j p_h (1 - 3p_i) \qquad (4),$$


Vo 
manna S&S > Ni-Gt highs 
EAE gs 2p a Metra: es ngs Ore: ba De! Dj teaas oaceuaeet ee (5), 
y= bg = 


and in the general case : 


ry up Vk 
epee rs Sess = (Ity-Higb. thy) uhpiy ks I 
Ein,” 1," Se Pea arcs Wve #1 Wee hy Ores fig o> Org hy Dig Die enue 
i= t= j= 


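If the numbers $n_{i_1}, n_{i_2}, \dots, n_{i_k}$ referred to $k$ series of independent experiments, then we should have:
$$E\, n_{i_1}^{r_1} n_{i_2}^{r_2} \dots n_{i_k}^{r_k} = E n_{i_1}^{r_1}\; E n_{i_2}^{r_2} \dots E n_{i_k}^{r_k}$$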
Tr; 


12 rk = 

a Ly a hy = ‘ 

= SB eee EMO NEM WOM iy ty Oey hg oo Seas te Bid® Pil «-- Bight 
hi=1 ho= hke=1 


(2) Passing from the mathematical expectations of the numbers of repetitions to the mathematical expectations of frequencies, we find:

$$E p_i' = p_i,$$

$$E p_i'^2 = p_i^2 + \frac{1}{N}\, p_i (1 - p_i),$$

$$E p_i'^3 = p_i^3 + \frac{3}{N}\, p_i^2 (1 - p_i) + \frac{1}{N^2}\, p_i (1 - p_i)(1 - 2p_i),$$

$$E p_i'^4 = p_i^4 + \frac{6}{N}\, p_i^3 (1 - p_i) + \frac{1}{N^2}\, p_i^2 (1 - p_i)(7 - 11 p_i) + \frac{1}{N^3}\, p_i (1 - p_i)(1 - 6p_i + 6p_i^2) \qquad (7),$$

SS 


r-1 
Ep;* = = 


— 
h=0 


1 h é 
yn di os (= Oy, nty Ben tf Dil corres (8), 
$$E p_i' p_j' = p_i p_j - \frac{1}{N}\, p_i p_j,$$

$$E p_i'^2 p_j' = p_i^2 p_j + \frac{1}{N}\, p_i p_j (1 - 3p_i) - \frac{1}{N^2}\, p_i p_j (1 - 2p_i),$$

$$E p_i' p_j' p_h' = p_i p_j p_h - \frac{3}{N}\, p_i p_j p_h + \frac{2}{N^2}\, p_i p_j p_h,$$

$$E p_i'^3 p_j' = p_i^3 p_j + \frac{3}{N}\, p_i^2 p_j (1 - 2p_i) + \frac{1}{N^2}\, p_i p_j (1 - 9p_i + 11p_i^2) - \frac{1}{N^3}\, p_i p_j (1 - 6p_i + 6p_i^2),$$

$$E p_i'^2 p_j'^2 = p_i^2 p_j^2 + \frac{1}{N}\, p_i p_j (p_i + p_j - 6 p_i p_j) + \frac{1}{N^2}\, p_i p_j (1 - 3p_i - 3p_j + 11 p_i p_j) - \frac{1}{N^3}\, p_i p_j (1 - 2p_i - 2p_j + 6 p_i p_j),$$

$$E p_i'^2 p_j' p_h' = p_i^2 p_j p_h + \frac{1}{N}\, p_i p_j p_h (1 - 6p_i) - \frac{1}{N^2}\, p_i p_j p_h (3 - 11p_i) + \frac{2}{N^3}\, p_i p_j p_h (1 - 3p_i),$$

$$E p_i' p_j' p_h' p_g' = p_i p_j p_h p_g - \frac{6}{N}\, p_i p_j p_h p_g + \frac{11}{N^2}\, p_i p_j p_h p_g - \frac{6}{N^3}\, p_i p_j p_h p_g \qquad (9),$$

and hence:

$$E (p_i' - p_i)^2 = \frac{1}{N}\, p_i (1 - p_i),$$

$$E (p_i' - p_i)^3 = \frac{1}{N^2}\, p_i (1 - p_i)(1 - 2p_i),$$

$$E (p_i' - p_i)^4 = \frac{3}{N^2}\, p_i^2 (1 - p_i)^2 + \frac{1}{N^3}\, p_i (1 - p_i)(1 - 6p_i + 6p_i^2),$$

$$E (p_i' - p_i)(p_j' - p_j) = -\frac{1}{N}\, p_i p_j,$$

$$E (p_i' - p_i)^2 (p_j' - p_j) = -\frac{1}{N^2}\, p_i p_j (1 - 2p_i),$$

$$E (p_i' - p_i)(p_j' - p_j)(p_h' - p_h) = \frac{2}{N^2}\, p_i p_j p_h,$$

$$E (p_i' - p_i)^3 (p_j' - p_j) = -\frac{3}{N^2}\, p_i^2 p_j (1 - p_i) - \frac{1}{N^3}\, p_i p_j (1 - 6p_i + 6p_i^2),$$

$$E (p_i' - p_i)^2 (p_j' - p_j)^2 = \frac{1}{N^2}\, p_i p_j (1 - p_i - p_j + 3 p_i p_j) - \frac{1}{N^3}\, p_i p_j (1 - 2p_i - 2p_j + 6 p_i p_j),$$

$$E (p_i' - p_i)^2 (p_j' - p_j)(p_h' - p_h) = -\frac{1}{N^2}\, p_i p_j p_h (1 - 3p_i) + \frac{2}{N^3}\, p_i p_j p_h (1 - 3p_i),$$

$$E (p_i' - p_i)(p_j' - p_j)(p_h' - p_h)(p_g' - p_g) = \frac{3}{N^2}\, p_i p_j p_h p_g - \frac{6}{N^3}\, p_i p_j p_h p_g \qquad (10)^*.$$
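The fourth moment of the deviation of a frequency from its probability offers a convenient numerical illustration of how the term of order $1/N^3$ compares with the term of order $1/N^2$. The following sketch computes the moment exactly by summation over the binomial law and splits the closed form into its two terms; the function names and the choice $N = 100$, $p = 1/100$ are assumptions made for the example.

```python
from fractions import Fraction
from math import comb

def central_moment_of_frequency(N, p, r):
    """Exact E(p' - p)^r, where p' = n/N and n follows the binomial law."""
    p = Fraction(p)
    return sum(
        (Fraction(h, N) - p) ** r * comb(N, h) * p ** h * (1 - p) ** (N - h)
        for h in range(N + 1)
    )

def fourth_moment_terms(N, p):
    """The terms of order 1/N^2 and 1/N^3 in E(p' - p)^4."""
    p = Fraction(p)
    q = 1 - p
    leading = 3 * p**2 * q**2 / Fraction(N) ** 2
    correction = p * q * (1 - 6 * p * q) / Fraction(N) ** 3
    return leading, correction

N, p = 100, Fraction(1, 100)  # p of order 1/N
leading, correction = fourth_moment_terms(N, p)
print(central_moment_of_frequency(N, p, 4) == leading + correction)
print(float(correction / leading))  # not negligible when p is of order 1/N
```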
In the general case we find : 
fe 7 Tem 1 eh 
E (pi — pi)" a = ATT = = 1} p, ks NF, (7) %, rrk-f Brakefk ® =); 
acest NI k=0 
f=Ent. (S) 
and hencet 
E(pi —pi)" =1.3.5...Qr s pi (L—p;)" | 
Usk 1 rat a 
+ Pap ( — pi)’ ae [(2r —1)—2p;(1 — p,) (4r + 1)] | 
ae v2 Noe (r = 1) = 2) | 9), 
+ Wr Di (1 - pi) 9790 [(207? — 607? + 381r + 1) | 
— 4p; (1 — p;) (40r8 — 307? — r + 8) 
+ 4p? (1 — p;)? (807° + 1207? + Tr — 21)] 4+. 


— 4p; (1 — p;) (56rt + 427° — 777? — 42r + 9) 


, opt 1 * . ite N 
E (pi a iy" a 1 . 3 . 5 eee (27; + 1) (1 _ 2;) | ye pi (1 — Pi)’ . 
Me Be (r— 
toa pe *1— pr a J(10r2—5r —3)—2p; (1 —p,) (207?+35r+12)| ‘) 
I p,2 .\r—2 is Cie me) ie DQ yeh} Py (13), 
+ yn ep) sous 2) ((28r4 — 8403 — 7? + 1477 = 54) | 


+ 4p2(1 — p; (L12r4 + 5049? + 665r* + 1897 — 72)] +. 


* The expressions for $E(p_i' - p_i)^4$, $E(p_i' - p_i)^3 (p_j' - p_j)$, and so on, show how dangerous it is to reject, without due qualification, terms containing $N$ to higher powers. Depending on the magnitude of $p_i$, $E(p_i' - p_i)^4$ may be either greater or less than $\frac{3}{N^2}\, p_i^2 (1 - p_i)^2$, according as $p_i (1 - p_i) \lessgtr \frac{1}{6}$; when $p_i$ is very small, of order $1/N$, the term rejected and the term retained are of the same order.

+ Cf. my paper, previously cited, ‘‘On the Mathematical Expectation of a Positive Integral Power 
of the Difference between the Frequency and the Probability of an Event.” Both formulae may easily 
be obtained directly from formulae (22) and (23) of Chapter I. 


AL. A. TCHOUPROFF 199 


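$$E (p_i' - p_i)^5 = (1 - 2p_i) \left\{ \frac{10}{N^3}\, p_i^2 (1 - p_i)^2 + \frac{1}{N^4}\, p_i (1 - p_i)\,[1 - 12 p_i + 12 p_i^2] \right\},$$

$$E (p_i' - p_i)^6 = \frac{15}{N^3}\, p_i^3 (1 - p_i)^3 + \frac{5}{N^4}\, p_i^2 (1 - p_i)^2\, [5 - 26 p_i (1 - p_i)] + \frac{1}{N^5}\, p_i (1 - p_i)\, [1 - 30 p_i (1 - p_i) + 120 p_i^2 (1 - p_i)^2],$$

$$E (p_i' - p_i)^7 = (1 - 2p_i) \left\{ \frac{105}{N^4}\, p_i^3 (1 - p_i)^3 + \frac{1}{N^5}\, p_i^2 (1 - p_i)^2\, [56 - 462 p_i (1 - p_i)] + \frac{1}{N^6}\, p_i (1 - p_i)\, [1 - 60 p_i (1 - p_i) + 360 p_i^2 (1 - p_i)^2] \right\} \qquad (14).$$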


ao eal 
Replacing NU(o+%)) in (5) by  (—1) Baysngz Nt’, we find after some 
1=0 


transformations : 
mtr, -1 1 


Ep" pf" = Sy 
Di Ps 0) Nf hy 


—h,—-h, Phy Poh 
> = G 1)f- Ons, ry—-hy Ure, re—he sparred vee fr-hy—hy Pi. * P5* : 
(7) 


where the summation for $h_1$ extends to all integer values from 0 to the smaller of the numbers $f$ and $r_1 - 1$, and the summation for $h_2$ to all integer values from 0 to the smaller of the numbers $r_2 - 1$ and $f - h_1$.


Substituting the values of Hp/—4 pj” in the development of 
ri 1, 
E (pi — pil (pj — pj)" = = 3. (—1)ht4 0,4 C,,2 pb pj Epi) p/m, 


we find after some rather tedious transformations : 
Pi+12-2 1 form-) \ 
Ul / ~ 
E (pi — pi)” (pj — pi = > Ne = 
f=Ent. (nee) : h,=0 


S—h, (or 2-1) 


Ss (—1)f-h-’e Noh pr-by" VY" 
— 
Di "DP; (")) (75) Ari, 71~ha Ary, ro—he me 
h.=0 - (16). 
| Cy Sera Ns sak a a ee 
1 —1 ™-1 
+r,-h,-h.-1 
ogre = a (Le ata ry hy Ore, rahe | 
2 


chy yy tons 
Bry tre—hy—he,f-hy—hg pir pyr 
In the general case we have: 


ts ee 4 Y - 
E (p's — Dis)” (Pig — Pig)!” ++ (Dig — Dig)” } 
Mtr+.. +72 pf orn) f-h (or re “VY fom hy Hy Cr DY 
= > = eS | 
ane > ae 
NTT ANAL WN h,=0 hy=0 h,=0 
i Mit i) 
x (— 1 fam ham hy Bar ee lke Tee gy” He ee a >(17) 
tthe CR re eg (rn) ¥ Oy). a eee | | 
Arp, Th—-hh Br,+.. Are —hy—.. hy, Fahy. — he 
1 rm—-1 r,—-1 r-1 r1—h, r,—h | 
ee > fs (- iat Sd aaa aay pe 


rytret... r= i 
Nr Ey, =0.n,=0 “Iy=0 ” % 


x Te ee Any, re—hh Brit ary —tn= iets Mt AT hy 
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If we agree to put A,,,=1 and C,-1=0, then if »,+7,+...+7, = 2r, the first 
term can be brought to the form (cf. Introduction (28)): 


1 r(or 7,-1) r—h, (or 7-1) r—hy—hy— ...— hy) (or 7, - 1) ; 
ie hy p: - 
eS Se ars S (—1y-¥p pe 
aN h,=0 hy=0 hy=0 (18) 
Qh, hy 2h ‘ 
gi Oa Oi Oe Aig, Ate, * Ai, Br ve 
and when 7, + 72+... + 7% = 27 +1, to the form (cf. Introduction (29)): 
iI r+l(or 7,-1) r+1—A, (or 7-1) r+1—h,—h,—...—he-, (or ry) \ 
a. Sa > (<1)tnE pe 
aN Fhe hy=0 ~ hp=0 : | 
ry—h, hy, 2h 2h 
x p;, ape iY NH. C. 5 OB ry Atay + An, pel oP RY | 
> (19) 
2h, 2h, 2h,-1 py2h; 2h 
+ OF Ot Ang + Ate Breit O,” OC," 0, * Anns | 
x Aj, 0° one 0 ee ote. | 
y2h, as 1 py2he-1 | 
ace C nee C. Ay he, Ae ro Aye, pags 2. rH, 0f | 


Noting that, in sd aghite ann (3), 
m ped , , 1 rYaVeor) 
Eps, Dig ++ Pig = EN ™ Pig Pin ++ Pi = Pi Pig * Diy =  (- De a Bn n (20), 
we find, on the other hand, easily : 


E (De =e) a — Pi,) + Die — Piz) 


ke 1 k-1 h 3 
(— 1) (ell 21 
= Pi, Pig ++ Baa pa Wh V* Bin = Diy Dig ++ Pit > a By, sn—k ep 
=0 k=Ent. @) 

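Hence:

$$E (p'_{i_1} - p_{i_1})(p'_{i_2} - p_{i_2}) \dots (p'_{i_5} - p_{i_5}) = p_{i_1} p_{i_2} \dots p_{i_5} \left[ -\frac{20}{N^3} + \frac{24}{N^4} \right],$$

$$E (p'_{i_1} - p_{i_1}) \dots (p'_{i_6} - p_{i_6}) = p_{i_1} \dots p_{i_6} \left[ -\frac{15}{N^3} + \frac{130}{N^4} - \frac{120}{N^5} \right],$$

$$E (p'_{i_1} - p_{i_1}) \dots (p'_{i_7} - p_{i_7}) = p_{i_1} \dots p_{i_7} \left[ \frac{210}{N^4} - \frac{924}{N^5} + \frac{720}{N^6} \right] \qquad (22).$$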
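The numerical coefficients of (22) can be recovered by expanding the product of deviations and applying (3) to every sub-product of frequencies with distinct indices. The sketch below does this in exact arithmetic; it assumes all indices are distinct, and the function names are illustrative.

```python
from fractions import Fraction
from math import comb

def falling_factorial(N, j):
    """N (N-1) ... (N-j+1), with the empty product equal to 1."""
    out = 1
    for t in range(j):
        out *= N - t
    return out

def distinct_product_factor(N, k):
    """Factor of p_{i1}...p_{ik} in E of the product of k deviations.

    Each sub-product of j distinct frequencies has expectation
    N(N-1)...(N-j+1) / N^j times the product of the probabilities.
    """
    return sum(
        (-1) ** (k - j) * comb(k, j) * Fraction(falling_factorial(N, j), N ** j)
        for j in range(k + 1)
    )

def bracket_22(N, k):
    """The brackets of (22) for k = 5, 6, 7."""
    N = Fraction(N)
    if k == 5:
        return -20 / N**3 + 24 / N**4
    if k == 6:
        return -15 / N**3 + 130 / N**4 - 120 / N**5
    if k == 7:
        return 210 / N**4 - 924 / N**5 + 720 / N**6
    raise ValueError("only k = 5, 6, 7 are written out in (22)")

for k in (5, 6, 7):
    print(k, distinct_product_factor(20, k) == bracket_22(20, k))
```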
II 
(1) Noting that = m,.(v)= BX ns =H | Pj 6] : 
we find 
k k 
Ms, (Ny = E 13 De ee, epi acy é,| 
j=l A=l Gh 
k k 
= > a Ep? + > > Sr &. Ep’ eae 
j=l ts Je 
k 
= a ‘a lpr 3 whi a-p)|+ 3 by Pe? , Sar [Pa Pig — yh p.| 
ie 2 1 k il 
= pe Pj e} + NV rae Pj EP - V 3 > iby = m? i [1% _ (|. 
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In a similar way, the other formulae deduced above may be obtained (Chapter I, 
§ II). 
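(2) Noting that

$$m_r' - m_r = \sum_{j=1}^{k} (p_j' - p_j)\, \xi_j^r,$$

we find:

$$E (m'_{r_1} - m_{r_1})(m'_{r_2} - m_{r_2}) = E \left[ \sum_{j=1}^{k} (p_j' - p_j)^2\, \xi_j^{r_1 + r_2} + \sum_{j_1} \sum_{j_2 \neq j_1} (p'_{j_1} - p_{j_1})(p'_{j_2} - p_{j_2})\, \xi_{j_1}^{r_1} \xi_{j_2}^{r_2} \right]$$

$$= \frac{1}{N} \sum_{j=1}^{k} p_j (1 - p_j)\, \xi_j^{r_1 + r_2} - \frac{1}{N} \sum_{j_1} \sum_{j_2 \neq j_1} p_{j_1} p_{j_2}\, \xi_{j_1}^{r_1} \xi_{j_2}^{r_2}$$

$$= \frac{1}{N} \{ m_{r_1 + r_2} - m_{r_1} m_{r_2} \} \qquad (23).$$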

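Similarly we find:

$$E (m'_{r_1} - m_{r_1})(m'_{r_2} - m_{r_2})(m'_{r_3} - m_{r_3}) = \frac{1}{N^2} \{ m_{r_1 + r_2 + r_3} - [m_{r_1 + r_2} m_{r_3} + m_{r_1 + r_3} m_{r_2} + m_{r_2 + r_3} m_{r_1}] + 2 m_{r_1} m_{r_2} m_{r_3} \} \qquad (24),$$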
Ee (a, — Mp.) (M0 pg — Mg) (1 gg — My) (Mp, — My,) \ 


IN? 


= ae We ir ge ep (Up ap — 1, ee | 
+ ae — My, Mrs] (Mertrg — Mo, My,] 


t [Meyer — Mr, Mrs] [Meoysry — Mery My, ]} 


Sam ae y 4 y 
ai N3 Ur trotrs+ra [itp ep acere Moy + Mr trotra Mrs 


+ May trytry Mr, + Megrrotry Mr, | 


+ Mp trg Megerg + Mri41, Merging + Mp, 47, Moree, } 


and hence, or directly from the formulae of § I:

$$E (m_r' - m_r)^2 = \frac{1}{N}\, [m_{2r} - m_r^2],$$

$$E (m_r' - m_r)^3 = \frac{1}{N^2}\, [m_{3r} - 3 m_{2r} m_r + 2 m_r^3],$$

$$E (m_r' - m_r)^4 = \frac{3(N-1)}{N^3}\, [m_{2r} - m_r^2]^2 + \frac{1}{N^3}\, [m_{4r} - 4 m_{3r} m_r + 6 m_{2r} m_r^2 - 3 m_r^4] \qquad (26).$$
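For a variable with few possible values and a small number of experiments, the moments of $m_r' - m_r$ can be obtained exactly by enumerating every ordered sample, which gives an independent check on the second and third moments of (26). The values, probabilities and function names below are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

def raw_moment(values, probs, r):
    """m_r = sum of p_j * xi_j^r."""
    return sum(p * Fraction(x) ** r for x, p in zip(values, probs))

def moment_of_deviation(values, probs, N, r, power):
    """Exact E(m'_r - m_r)^power over all ordered samples of size N."""
    m_r = raw_moment(values, probs, r)
    total = Fraction(0)
    for sample in product(range(len(values)), repeat=N):
        weight = Fraction(1)
        for j in sample:
            weight *= probs[j]
        m_prime = sum(Fraction(values[j]) ** r for j in sample) / N
        total += weight * (m_prime - m_r) ** power
    return total

values = (0, 1, 3)
probs = (Fraction(1, 2), Fraction(1, 3), Fraction(1, 6))
N, r = 3, 2
m = lambda s: raw_moment(values, probs, s)
print(moment_of_deviation(values, probs, N, r, 2) == (m(2 * r) - m(r) ** 2) / N)
print(moment_of_deviation(values, probs, N, r, 3)
      == (m(3 * r) - 3 * m(2 * r) * m(r) + 2 * m(r) ** 3) / N ** 2)
```

The enumeration grows as $k^N$, so it serves only as a check on small cases.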
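III


(1) Denoting by $\omega'$ the difference $X_{(N)} - m_1$ and by $d\mu_r'$ the difference $\mu_r' - \mu_r$, we find

$$\nu_r' = \sum_{j=1}^{k} p_j'\, [\xi_j - X_{(N)}]^r = \sum_{j=1}^{k} p_j' \sum_{h=0}^{r} (-1)^h\, C_r^h\, (\xi_j - m_1)^{r-h}\, \omega'^h$$

$$= \mu_r' - C_r^1\, \omega' \mu'_{r-1} + C_r^2\, \omega'^2 \mu'_{r-2} - C_r^3\, \omega'^3 \mu'_{r-3} + \dots$$

$$= \mu_r + d\mu_r' - C_r^1\, \omega' \mu_{r-1} - C_r^1\, \omega'\, d\mu'_{r-1} + C_r^2\, \omega'^2 \mu_{r-2} + C_r^2\, \omega'^2\, d\mu'_{r-2} - C_r^3\, \omega'^3 \mu_{r-3} - C_r^3\, \omega'^3\, d\mu'_{r-3} + \dots \qquad (27).$$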

W. F. Sheppard, in his well-known investigation "On the Application of the Theory of Error to Cases of Normal Distribution and Normal Correlation*," terminates the development at the third term, taking

$$\nu_r' = \mu_r + d\mu_r' - r\, \omega' \mu_{r-1} \qquad (28).$$

Hence:

$$E \nu_r' = \nu_r = \mu_r,$$

$$\nu_r' - \nu_r = \nu_r' - \mu_r = d\mu_r' - r\, \mu_{r-1}\, \omega',$$

$$E (\nu_r' - \nu_r)^2 = E (\nu_r' - \mu_r)^2 = E (d\mu_r')^2 - 2r\, \mu_{r-1}\, E \omega' d\mu_r' + r^2 \mu_{r-1}^2\, E \omega'^2$$

$$= \frac{1}{N} (\mu_{2r} - \mu_r^2) - \frac{2}{N}\, r\, \mu_{r-1} \mu_{r+1} + \frac{1}{N}\, r^2 \mu_{r-1}^2 \mu_2$$

$$= \frac{1}{N} \{ \mu_{2r} - \mu_r^2 - 2r\, \mu_{r+1} \mu_{r-1} + r^2 \mu_{r-1}^2 \mu_2 \}.$$

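We thus obtain, with full accuracy (cf. Chapter III (13)), the first term of the development of $E(\nu_r' - \nu_r)^2$ in powers of $1/N$. This is explained by the fact that the terms rejected by Sheppard in the formula (27) do not yield terms of order $1/N$ in $E(\nu_r' - \nu_r)^2$. Owing to the same circumstance, we also get accurately the first term in the development of $E(\nu'_{r_1} - \nu_{r_1})(\nu'_{r_2} - \nu_{r_2})$, starting with (28):

$$E (\nu'_{r_1} - \nu_{r_1})(\nu'_{r_2} - \nu_{r_2}) = E (\nu'_{r_1} - \mu_{r_1})(\nu'_{r_2} - \mu_{r_2}) = E\, [d\mu'_{r_1} - r_1 \mu_{r_1 - 1}\, \omega']\, [d\mu'_{r_2} - r_2 \mu_{r_2 - 1}\, \omega']$$

$$= \frac{1}{N} \{ \mu_{r_1 + r_2} - \mu_{r_1} \mu_{r_2} - r_1 \mu_{r_1 - 1} \mu_{r_2 + 1} - r_2 \mu_{r_1 + 1} \mu_{r_2 - 1} + r_1 r_2\, \mu_{r_1 - 1} \mu_{r_2 - 1} \mu_2 \} \qquad (29).$$


* Phil. Trans. A, Vol. 192.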


* Phil. Trans. A, Vol. 192. 
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But we cannot start with (28) in the calculation of the mathematical expec- 
tation of (y,’ — v,), Just as we cannot obtain the further terms in the development 
of E(v,’—v,)? in powers of 1/N. For these purposes Sheppard’s method must 
be put into a slightly changed form: more than three terms must be retained in 
(27). In the calculation of the terms of order 1/N?, we must at the same time 
rely on the formulae of § III of the second Chapter and on the following relations 
easily deduced from them : 


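$$E \omega' d\mu_r' = \frac{1}{N}\, \mu_{r+1}, \qquad E \omega'^2 = \frac{1}{N}\, \mu_2 \qquad (30),$$

$$E \omega' d\mu'_{r_1} d\mu'_{r_2} = \frac{1}{N^2}\, [\mu_{r_1 + r_2 + 1} - \mu_{r_1 + 1} \mu_{r_2} - \mu_{r_1} \mu_{r_2 + 1}],$$

$$E \omega'^2 d\mu_r' = \frac{1}{N^2}\, [\mu_{r+2} - \mu_r \mu_2],$$

$$E \omega'^3 = \frac{1}{N^2}\, \mu_3 \qquad (31),$$

$$E \omega' d\mu'_{r_1} d\mu'_{r_2} d\mu'_{r_3} = \frac{1}{N^2}\, [\mu_{r_1 + 1} \mu_{r_2 + r_3} + \mu_{r_2 + 1} \mu_{r_1 + r_3} + \mu_{r_3 + 1} \mu_{r_1 + r_2} - \mu_{r_1 + 1} \mu_{r_2} \mu_{r_3} - \mu_{r_1} \mu_{r_2 + 1} \mu_{r_3} - \mu_{r_1} \mu_{r_2} \mu_{r_3 + 1}]$$
$$+ \frac{1}{N^3}\, [\mu_{r_1 + r_2 + r_3 + 1} - (\mu_{r_1 + r_2 + 1} \mu_{r_3} + \mu_{r_1 + r_3 + 1} \mu_{r_2} + \mu_{r_2 + r_3 + 1} \mu_{r_1})$$
$$- (\mu_{r_1 + r_2} \mu_{r_3 + 1} + \mu_{r_1 + r_3} \mu_{r_2 + 1} + \mu_{r_1 + 1} \mu_{r_2 + r_3}) + 2 (\mu_{r_1 + 1} \mu_{r_2} \mu_{r_3} + \mu_{r_1} \mu_{r_2 + 1} \mu_{r_3} + \mu_{r_1} \mu_{r_2} \mu_{r_3 + 1})],$$

$$E \omega'^2 d\mu'_{r_1} d\mu'_{r_2} = \frac{1}{N^2}\, [\mu_{r_1 + r_2} \mu_2 + 2 \mu_{r_1 + 1} \mu_{r_2 + 1} - \mu_{r_1} \mu_{r_2} \mu_2]$$
$$+ \frac{1}{N^3}\, [\mu_{r_1 + r_2 + 2} - (\mu_{r_1 + 2} \mu_{r_2} + \mu_{r_1} \mu_{r_2 + 2}) - (\mu_{r_1 + r_2} \mu_2 + 2 \mu_{r_1 + 1} \mu_{r_2 + 1}) + 2 \mu_{r_1} \mu_{r_2} \mu_2],$$

$$E \omega'^3 d\mu_r' = \frac{3}{N^2}\, \mu_{r+1} \mu_2 + \frac{1}{N^3}\, [\mu_{r+3} - \mu_r \mu_3 - 3 \mu_{r+1} \mu_2],$$

$$E \omega'^4 = \frac{3}{N^2}\, \mu_2^2 + \frac{1}{N^3}\, [\mu_4 - 3 \mu_2^2] \qquad (32).$$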


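(2) To get the exact value of terms of order $1/N^2$ in the development of $E(\nu'_{r_1} - \mu_{r_1})(\nu'_{r_2} - \mu_{r_2})$ in powers of $1/N$, it is necessary to start from

$$\nu_r' - \mu_r = (d\mu_r' - r\, \mu_{r-1}\, \omega') - \left( r\, d\mu'_{r-1}\, \omega' - \frac{r(r-1)}{2}\, \mu_{r-2}\, \omega'^2 \right) + \left( \frac{r(r-1)}{2}\, d\mu'_{r-2}\, \omega'^2 - \frac{r(r-1)(r-2)}{6}\, \mu_{r-3}\, \omega'^3 \right).$$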
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After some simple transformations we find: 


AW fal : ' 1 \ 
rp, (v Tite Hr) (v yee Mrs) = N {Prytrs ae Br, Pr area Pry Mri mee lae Pry Mr-1 | 
FT Ys Pry Pri be} 


1 (1+) (1+12—1) 
ar N2 \- (71 + 12) Mrytr, + nl D) ~ Prytre—2 Me 
2 Maryse Prg—2 + ee Mry—2 Prg+2 


1 [-3] 


ree o(7: ‘laa ail) ry 1 i ee 
+7 +12) Myr Mra FP (11 + 12) Mya Bret — “Mt Mro~3 Me 


(33). 
= tr bry Prot By, ty (7, ste Y2 alg 2ry7 2) Bry Pr, 


Ty ee 


1 (7 —1) 
= ner) (3r, + 2) Pr, Pr —2 be — 


1) a 
(Br, + 2) bry» Mor Me | 
—3nr, (7+ 12) Mri Pr Me — 41 (od te 1) Pry bre—2 M3 
aa ee (7 — 1) 12 Pro Marga a + en" rfl Pry Pre—3 pe + tr r, Mry—3 Pra He” 


+ 3n(m-I rir 1) fry -2 Brg—2 ue Slaels | 
Noting that 
i Ce = Vr,) (x, ay a) jal E(v',, a i) (es ae bry) _ (vy, =; br) (Vr, - ee) 


j ) 1 
=H 1 Pr) (Vr, — Hr) — ig Lit. Pry Mr, — 21102 (Ge) Pry Pre—2 Me 


—4r, Gee) Mr —2 Pry M2 + ree (ee Wie Cay 1) Mr —2 Pr--2 ue] "OG 


we find hence : 
7Y / / 1 \ 
E (v 1 Vy) (v tT. Vy.) = N ereers —12 Pry Prg-1 rami Pry Myo+i | 


= bry Pry F172 Pry Pra fo} 


I (7, +72) (+72, —-1 
+ ie \- (71+ 12) Mrytr, + (n 2 ve!) Mry+r2—2 M2 


+ $12 (12 — 1) Prt Pergo t $11 (7 — 1) Mra Prete 

+1 (Ti+ 1) Mrysa Pry F 11 (M1 + 12) Pry Prt. — E71) Mgt Mry—s Me | 

Sala! bry-3 Prt fot (i +Tet+ 112) Mr, Pre 

= +1)a,¢,>1) ee Py,—2 ha 11 (7, — 1) (2+ 1) Mry-2 Mery Me | 
Bi. (Ty + 12) Mra Mri He — 2 BMT. (1% — 1) Myr Mrg—2 Ms 


(34). 


Bea WW (ert + Be Alek 2 
ani” 1) 13 Mry-2 Mpa Hes + Err Pry-1 Pro—3 pe +47! 11s bry—s Pry-1 2 


+47 (71-1) 72 (72-1) Mee Mrye ust te cies | 
Putting $r_1 = r_2 = r$ in this, we find (cf. (13) of Chapter III):


7 2 1 9 9 } 

Ev, — vy, = WV (Mop — 20 pps pa — fl? + 7? pra fa} | 

: | 
Hapa (= 20 Mor Fo (20 = 1) pope Ho + PP — 1) Mrgs Mone + (35). 


+ ir? pgs ra — TO) ppg fps Me 
+7 (7 + 2) pb = 20 (7° = 1) pp Mpa Me — 89° pp flo — 7? (7 — 1) fra fro Bs | 
+79 (7 —1) (7 — 2) ppg Myais a? + 30° (7 — 1)? ppna pee} + .s- 
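(3) To find accurately the first term (of order $1/N^2$) in the development of
$$E (\nu'_{r_1} - \mu_{r_1})(\nu'_{r_2} - \mu_{r_2})(\nu'_{r_3} - \mu_{r_3})$$
we must start with

$$\nu_r' - \mu_r = (d\mu_r' - r\, \mu_{r-1}\, \omega') - \left( r\, d\mu'_{r-1}\, \omega' - \frac{r(r-1)}{2}\, \mu_{r-2}\, \omega'^2 \right).$$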


Using the relations (30), (31) and (32), we find without difficulty : 


Ki, cy Hr,) (V'r, =a oe) (Vp, oa. ) 


1 
7 Ne Jensen a [Hrtre Mrs + Mrytrs Pro + ry Pro+ry| a5 2Mr, rz rs 


Taz ["; Bry (Mrytrst1 ~ Prov brs — Pre Lrs+1) Fy by (Mrytrgtt ~ Pry brs 
— Bry Mryti) +3 Mri (Mrytrgt — Pry Pry, 7 Pry Mro+1) | 


+ P72 Mr Mir (Hg 42 — Pry Po) + T47"3 Pry Mrg-1 (Mirgt2 — Herp He) 
+ 127s Pry Parga (Mryg2 — Pry Me)| = 1172's Mya erga Mery Bs | 

— Us (Mast Bretng aH Mires Prze rg + Pry Myre — Peryta Pre Mery 
= Mery Prat Perg—a — Mery Mery Mrs) 


ar Ue) (Mrs Pro+r3-1 a0 Brg Prytre—-1 2 Pr. Mry+r3 in Pry Pry brs 
= Pry Marga ryt — Pry Mery Pry) 


Ey (Maser Mrytrea + Meret Mrytry—a + Mery Mrgtrs — Bry Berg Brgts 

; = flry—1 Pret Pry — Pr, berg Hers) | 
FPS Mra (Mrstrg—a Ma + fora ir erg — Pir Pry—1 Pe) 

+ oP 5 Prs—a (Mrytrg—1 Pa + 2Mry ts brs — Pry Merg—i Me) + (36). 


FP Ts bry (Marg pry—a My + 2 pry Margi — Mrg—1 Pry He) 
F123 Mga (Mery trea Mo + 2 pry borg — Pry Pry—i Mo) 

ETT: My Mary try Pe + fly Merge — Pry Pirg M2) 
FT 3 arg a (Par pry 1 Mo + pry Mrgta — Pry Pry M2) | 


— [87 rer; (Mry—1 Mers—1 Mrs Ma Mra Pre Mrs—1 Pe + br, Pro rs Hy) | 
+ [E13 (7s — 1) pyyo (Mryery Mat pr yaa Pret — Pry Pre Hs) 

FST (%2— 1) Mya (Mrysirg Ma + Qtr 41 Mga — Mr, bry M2) 

+ 511M = 1) pa Meets Mo + Derg ts Mrgta — Pers Hr, He) 
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— 3 [drirs (%3— 1) bra Merger Berge Ha + $7273 (13 — 1) Pry Pr. Pr3—2 Ke | 
ar r T2 (7 7 1) Pry Pre-2 Pr3ti M2 ats a1 ("5 aa 1) r, May Mry—2 Prs—1 Ke 
+én(m—-1)” Mr,—2 Mrg—1 Mgt Hot 31% (1 — I) 75 Br —2 Past Mr3-1 He] 


+ 3[$rrers(7s — 1) pra Pere Prs-2 Me + $1717 (%2— 1) | 
X 13 Mra Pergo Pre He + OP) (CRS nae Mry—2 Pry—1 Mr3-1 ra +... | 
Noting that 
EE (Up, — Yq) (U'rg — Yrg) (Y'rg — Yrg) = EY ry = bry) (Org — ber) (V'rg — Mg) 
— {(Yp, = pry) EB (0'rg — bre) (U'rg — bra) + (Ung — Mra) 


x E (COPS = re) an ae Mrs) aF (oy = brs) BK OT a ) (DG. oat H+,)} 
+2 (Cs a Mr) (op, ral ry) (Vr, ai brs) 


(37), 
X [reves — Pre Pry — V2 Prg—t rg — 1s Mire Mrs + T27°3 Mry—1 Prg—1 M2] 
+ [12 Mr — $72 (%2 — 1) fyg—o Me] [Mears — Mr, Pry — 12 Pry Brg 
13 Pyysa Mga + 117s Mp, Pry He] 
+([r; brs — $73 (73-1) Mry—2 po | [Hotes 7 Pry Pry 7 17) Pry Pret 
= 1 ory Mg + 112 Mri Mry—1 Mel} +... 
we find hence the first term in the development of 


E Ga a Vr) (Chies, i Vee) (vr, Si, Vrs) 


\ 
| 
=i in) (1, — Pra) (55 airs a (lr Mr, — $71 (7% — 1) Hr,—2 Me] | 
| 
| 
) 


in powers of 1/N. 


Putting $r_1 = r_2 = r_3 = r$, we find:
, i ; : 
yr Hy) al Ww {Har — 39 for Bra — 8 (7 +1) Ber by +3r(r — 1) poy Mrs Pa 


= GF Popa Pergr + 69? papa Mya fo + 37? Ppgo Mpa + Q0 (+1) Mpg fr Mra (38) 
+86 (7 — 1) pegs fre — 992 (7 — 1) ppg pa bya Mle 
+ (3r at 2) (OR = ar (r ar 1) ber” My—2 be 


— 97? ET) Me Bip Bo 8 pa Ms $7? (PD) pep Mpa Me} + oe 
¥ r / 5 3 
E (ug — vy) = EB (vg — fy) + NM: [rp —4r (7 — 1) pp fe] 


X [Mor — 2P fra Mya — ee +e? Mra fa] +. | 
1 
7 WN? {Mor — 8F Parsi Mra — Slop bp — OF Mop Mrz | (39) 
+69 (7 + 2) frga Me Mra + 87 (7 — 1) wy bye 
— 6r? (7 = 1) Myr Mra Mra Mo + 2p? 
— 3r? (2r a 3) Br pa fe — 79 pa Ps + 3r° (r —1) ra Pyr-2 p."} +... ) 


r 
+ 67? plop Mra Me + 89? ppye pa | 
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(4) The first term (of order 1/N*) in the development of 
E(u’, 5 br,) (Cae = bry) Cie = fu) (Ce a br,) 
in powers of 1/N is obtained exactly from (24). We find: 
E Ge ag br,) (Ye ba b+.) Oz, a brs) Ca aa br) 


\ 
| 
ar WN liner ar Bry eA Deretrs a brs je ar [eepecers age Bry Ts) reer am Hrs, | | 


ar [Prins — Bry Lr, | [Mro+rrs — Pre brs] 


= [Ty hry Pret Mestre F Pry Pre tiry + ryt Meets — Meret Pry Mery | 
— Pre Prati kry — Pre brs Prt) 


He Mry—a (Maryn Barges + Pry Prytry + Moret Mrytrs — Bryti Pers ery 
— Bry Pr Bry — Pry Pry Prgti) 

+ 1s flrg—a Maya Mretre + Prats Mryery + Bren Payers — Pry Pry Pre 
= bry Mery Mery — Mery Mery Brat) | 
Hs Pog (Met retry + Pret Prytrs + Mrs Pree — Prytt Pre Peg . (40). 
= Pry Berga Miry — Pry rg ryt) | 

+[rire Pera Pera (Margery Ba + 2Mrgsr Morya — Mery Pry He) 

FLT 3 Mery Mga Merry Me + 2biry ta ryt — Pry Mery M2) 


FT Parga Mergaa (Margy Ho + pry a Margi — Mere Mery Pe) 
F273 Perea Pers (Pry tre Me F 2p Marya — bry Mery M2) 
F127 5 Parga Marga (ry srg Be + 2s Pry — Pry Perg M2) | 
FST Pag Morya (Mary try Pa + 2M ryt Pret — bry Pry Po) | | 
— 8 [7 172% 3 Mry—i Pry—i Pry Mryt BeF M124 ry Pra Mery Mry-1 Me | 
FPS 4 Mrmr Prt Mrg—a Mra Ma F234 Mery Mrgi Pry Pry fe] | 
+ B72 374 fry —1 Pry —a Herg—1 Pry He ]} +... 
Noting that 
EE Up, — Vp) (Org — Yr) (W'rg — Yrg) Org — Pre) 
= EB (V'p, = Bary) (Ur = Berg) (rg = berg) (Ung — Pera) 
Oy te) Bs tO IO ee teed 
; Ug re) Uy [he De [lg CU oy — fixe) 
+ (pg — Pry) EE (V'ny — Bry) Org — Per) O'rg — Ber) 
+ (Peg — Pry) Evy = fry) rg = Bere) (rg — brs) } 
+ (Ye, — fry) (Yrs — Bry) E (rg — Pers) ng — Bry) 
+ (Vig = bry) rg — Berg) B (Ung = berg) (Ung = bers) 
siz (vy, — Pr,) (Vr, — r,) E CH — fry) Ce aa berg) 
+r, = pera) Wry — berg) 2 U'ey = Bey) Ug — Be) 
+ (Pp, — fre) (rg — Berg) E oy — por) Org — Her) 
+ (rg = Hrs) rg — Berg) 2" = bry) (Ur — Here) 
= 3 (Vp, = pry) (Ura Mra) Yrg — Mirs) rg — berg) 
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we see that the terms of order 1/N? in the developments of 
i O., i i) (ie. 3 fire) (he a fry) (15, os ty) 


and Ee (vp, — Up )U pe Vp) ig, Ve) VG — I) 


coincide. 


Putting $r_1 = r_2 = r_3 = r_4 = r$, we find:

$$E (\nu_r' - \nu_r)^4 = \frac{1}{N^2} \{ 3\, [\mu_{2r} - \mu_r^2]^2 - 12 r\, \mu_{r-1} \mu_{r+1}\, [\mu_{2r} - \mu_r^2] + 6 r^2 \mu_{r-1}^2\, [\mu_{2r} \mu_2 + 2 \mu_{r+1}^2 - \mu_r^2 \mu_2] - 12 r^3 \mu_{r-1}^3 \mu_{r+1} \mu_2 + 3 r^4 \mu_{r-1}^4 \mu_2^2 \} + \dots$$

$$= \frac{3}{N^2}\, [\mu_{2r} - \mu_r^2 - 2r\, \mu_{r+1} \mu_{r-1} + r^2 \mu_{r-1}^2 \mu_2]^2 + \dots \qquad (41).$$

The same formula (41) gives also the first term (of order $1/N^2$) in the development of $E(\nu_r' - \mu_r)^4$.
(5) In the general case if we agree to denote 
Veg) (UY ope rs) oe Yee re) OY es 


and (o'r, — Pry) ("rg oi br) set (V'n, a 7) by 8p, 


we have: 


: ae i , op 
Edy = KdOy — & (Vr, — bry) Dae. xe 
hel V rn — Brn 6 
+> (Urn, os Pry,) (Yr, — Pry,) B | a) Ge Hives 
h=1 hyg=h, +1 3 (y th, AT tin remains 


We see that, in the developments of Hd™y and Hé*y in powers of 1/N, 
the first terms (of order 1/N*) coincide, since 


g 6p 
> 7 
— (re, = ay) E , 
n=l U th erp 
; 1 : oi 
contains no terms of order Ne On the contrary, in the developments of Hd?) y 
and #é"'* » the first terms (of order 1/N**) are different, for 
t 6 Git) yp 
7 
> (Un, ~ Pr,) LS = 
i= 1 LE tea 
contains terms of order 7 
One SEs eens WNT 


Formulae (18) and (19) of Chapter IT permit us to calculate Hy and 
Hd" y in general, to an arbitrary degree of accuracy, in the same way as Hd"y, 
Ed®v, Ed v were found above. The actual expression, however, is of so un- 
wieldy a character that I shall limit myself to the calculation of the first term in 
the development of /'(v,’ — v,)*, coinciding with the first term in the develop- 
ment of  (v,’ —p,)*. 

In the calculation of the first term of the development of # (v,'—j,)” we may 
take 
Ve — Pe = du,’ mem ao, 
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and limit ourselves to the calculation of the terms of order 1/N*‘ in the development 
27 
E (dp, — rp, o' ! = & (— 1) Ci, 75 pi, E (du, YI 04 .0...... (43). 
j=0 
In formula (19) of the second Chapter put 7; =r, =... =7;=1, and 


UGE | 1 iy eee =H y= 5 . 


we find for 7=0: 
B (dpeg P= pV. B 5 oo. QA) [pap — pe] one 


for 7 — ll c 
Roane a eee Z 
E (dp, ) ae oO = Ni oe Oban (21 — NY tary Fes —p,’]! AE de (44), 
When j = 2h: 

E (C/T wa wl 

: ‘S : , 20 ast —2h—-2] l (or h) 

Ses oS 18d. (2) = 2h — DE —1) Ce y 

aN Ps Da yy af (} ) Rae 7 


Ie Oe Ole 2e— 1). 13-5 2.20 — 2h — 1) vet ORY 2 iy 
= vi Se ae 
1=0. f=0 
27 [4a —Ah]-Ahl j_-¢ of  -Mh—-na 
(—f)iQfyi or Pry h, Be 
Ni ag 
h(ori-h) 2°9 ines hi-a [i —h Fp, oe 
x >: —— op — fe eel eet eae 
g=0 (29)! [Har — Hr] ) 
When j= 2h +1: 
vi (Gf Near @ ti 


\ 
af =~ ¢ Q = 5 ¢ l-f of h—-f 
x OF [Qh]-°1.8.5... (Q0-2f—1).1.3.5... (2h 2f-1) py a | 


J i-h-1 21-41 2i-2h-21-2 


=a 2 (11.8.5... Qi 2-2-3) OS, 


\ 
| 
Etoun) ft ets , ; | 
fae a (Qh e Ne rLh 865 e227 Val .B25...(2h—2f—1))| 


0} oar 
. t— fe ole lis 
to, Peay fo 


1.3.5...(2h+1).1.3.5...(Q0—-2h-1) HEP EM ( 
N' i=0 f=0 
27 [a —h—1-AhCA yp optt 2i-2n—v-91 n= 
Cereus a i 
Wes 2h gal) 153 be..(4— k=) 
Ni 
he(or i= h=1) 2 pO AM [i —h — VM 4 


x = 7 ed pee 2 i-h-1-g 4 ae 
2 (2g E 1) ! [Mor Lr ] 


\. (46). 


1 eg 


—————— 
Substituting in (43) we find, after suitable transformations:

$$E (\nu_r' - \nu_r)^{2i} = \frac{1 \cdot 3 \cdot 5 \dots (2i - 1)}{N^i}\, [\mu_{2r} - \mu_r^2 - 2r\, \mu_{r+1} \mu_{r-1} + r^2 \mu_{r-1}^2 \mu_2]^i + \dots \qquad (47).$$

Noting that

$$\frac{E (\nu_r' - \nu_r)^{2i}}{[E (\nu_r' - \nu_r)^2]^i}$$

tends with increasing $N$ to the limit $1 \cdot 3 \cdot 5 \dots (2i - 1)$, and the ratio

$$\frac{[E (\nu_r' - \nu_r)^{2i+1}]^2}{[E (\nu_r' - \nu_r)^2]^{2i+1}}$$

tends with increasing $N$ to the limit zero, we see that the law of distribution of the values of $\nu_r'$ tends with increasing $N$ to the Gauss-Laplace law.


lies ee 


E (us ,; tends with 


Comparing (47) with (14) of Chapter II, we see thi 
increasing WV to the limit 1. 


In the case in which the law of distribution of the values of the variable X 

i (v' or oe 
: SEG or Por)? ’ 
tends with increasing N to the limit 1, for every positive integer r. But 
BY ots Pata tends even for a Gaussian distribution of the values of XY to 
E(w or-+l Porsi)” 
the limit ditferent from unity: 


follows the Gauss-Laplace law and po. = 0, for r= 0, 1, 2, 3,... 


VB 256 (27 1) 
~ (Qr +3) (2 aD). . (47 +1)" 


Corrigenda to Part I, Biometrika, xu, pp. 140—169. 


p. 142, Eqn (2) for az; NI-*) read a,,;NI- 4. 
p. 142, Eqn (4) for A‘0* read A‘Or. 

p. 147, 1. 19 for gy41 read gx_1- 

p. 151, last line Eqn (11) for m,.(w) read my (x). 
p- 156, Eqn ( mete fp—1 Pea py_j. 


p. 157, 1.6 for = cep) read v, = (X;) — my. 


p. 157, nas re Proc. Imp. Renae Had Mém. Acad. and under Chebysheff refer to t. 11, 
p. 478, especially of ae ae edition of his works. 
p. 160, 1. 8 for Me (ny 7 Ms Se 


p- 160, last Eqn of (11) for D”® read Dae 


r, (m) r, m—2° 


ud v! 


p. 162, throughout this section y of author’s MS. has been printed YX. 


ple nied) 
72 


p. 163, last line of Eqn (15) insert ml—*! after ‘, and in 6th line of Eqn (15) for ml-?] 


= read mt~ 3], 


after ~ a5 
p. 167, 8th line from bottom of page for En; read Hai dp. 
p- 167, 2nd line from bottom of page for Ay'"fv "fy °"ht-2%-3) read aa a 
oJ 
p. 168, 2nd line from bottom of page for A") read again K eg 


AN EXPLANATION OF DEVIATIONS FROM POISSON’S 
LAW IN PRACTICE. 


By “STUDENT.” 


In her paper on the Poisson Law of small numbers, Biometrika, X, p. 36 et seq. 
Miss Whitaker after a very interesting analysis of the various attempts which 
have been made to test Poisson’s Law on actual statistics concludes that “A general 
interpretation based on a very simple conception seems needed for those demo- 
graphic cases, in which the law of small numbers appears far more often to 
correspond to a negative than to a positive binomial.” 


The following is an attempt to explore the general question of what effect 
various departures from the conditions which lead to Poisson’s Law have on the 
resulting statistics, and especially which conditions lead to positive and which to 
negative binomials when the exponential might at first sight be expected. 


Poisson’s Law has been applied to the occurrence of different numbers of 
individuals in divisions of space or time: thus of yeast cells in squares of a 
haemacytometer, of deaths from the kick of a horse in Prussian Army Corps which 
may be taken as individuals occurring in divisions of space, or of suicides of 
children per year in Prussia which are individuals occurring in divisions of time. 
In such cases it has been asserted that if the chance of an individual being found 
in a given division is so small that when multiplied by the very large number of 
individuals the product is still a reasonably small number, then the frequency of 
divisions containing 0, 1, 2 ... r individuals will be given by the terms of the
exponential

$$N e^{-m} \left( 1 + m + \frac{m^2}{2!} + \dots + \frac{m^r}{r!} + \dots \right),$$

where N is the number of divisions and m the mean number of individuals
occurring in a division.
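The terms of the exponential are easily tabulated. The following is a minimal sketch; the function name and the sample figures (400 divisions, mean 1.8) are illustrative assumptions, not data from any of the series discussed.

```python
from math import exp, factorial

def poisson_frequencies(N, m, r_max):
    """Expected number of divisions holding 0, 1, ..., r_max individuals:
    the successive terms N * e^(-m) * m^r / r! of the exponential."""
    return [N * exp(-m) * m ** r / factorial(r) for r in range(r_max + 1)]

frequencies = poisson_frequencies(N=400, m=1.8, r_max=8)
for r, f in enumerate(frequencies):
    print(r, round(f, 2))
```

The terms beyond `r_max` are omitted, so the tabulated frequencies sum to slightly less than N.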

For the above to be true it is necessary 

(1) That the chance of falling in a division is the same for each individual. 

(2) That the chance of an individual falling in it is the same for each 
division. 

(3) That the fact that an individual has fallen in a division does not affect 
the chance of other individuals falling therein. 
As to these three conditions (1) is seldom or never true. I propose to show
that this is generally unimportant; unless the chances of some individuals falling
in a particular division are relatively high the Poisson law holds; the tendency
however is towards a positive binomial.

Next (2) is comparatively seldom true except in the case of artificial divisions. 
The result of this, as Pearson has shown, is that a negative binomial fits the 
results better than the exponential. 

Lastly (3) is often untrue. It will be shown that if the presence of an individual 
makes another less likely to fall into a division the positive binomial, but if more 
likely, the negative binomial will fit the figures best. 

We may start from the fact that if the chance of an event happening be q and
of its not happening p, then the chances of its happening 0, 1, 2, etc. times in
n trials are given by the terms of the expansion of $(p + q)^n$, viz.

$$p^n, \quad n p^{n-1} q, \quad \frac{n(n-1)}{2!}\, p^{n-2} q^2, \quad \text{etc.}$$

As the moment coefficients of this series about the zero end of the range are

$$\nu_1 = nq, \qquad \nu_2 = npq + n^2 q^2, \quad \text{whence } \mu_2 = npq,$$

the binomial is completely determined if we know $\nu_1$ and $\mu_2$, for

$$p = \frac{\mu_2}{\nu_1}, \qquad q = 1 - p = 1 - \frac{\mu_2}{\nu_1} \qquad \text{and} \qquad n = \frac{\nu_1}{q} = \frac{\nu_1^2}{\nu_1 - \mu_2},$$

and in particular the binomial is positive (i.e. n and q are positive) if $\mu_2 / \nu_1 < 1$ and negative if $\mu_2 / \nu_1 > 1$. In the particular case when $\mu_2 / \nu_1 = 1$ the binomial becomes the Poisson exponential.

It is therefore unnecessary to deal with higher moments than the second for
the purpose in hand.
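The determination of the binomial from its first two moment coefficients can be set out as a short routine. This is a sketch only; the function name and the labels it returns are illustrative, and exact fractions are used so that the boundary case $\mu_2 = \nu_1$ is recognised without rounding.

```python
from fractions import Fraction

def fit_binomial(nu1, mu2):
    """Return (kind, n, p, q) for the binomial (p + q)^n whose mean is nu1
    and whose second moment coefficient about the mean is mu2."""
    nu1, mu2 = Fraction(nu1), Fraction(mu2)
    p = mu2 / nu1
    q = 1 - p
    if q == 0:
        # mu2 / nu1 = 1: n is infinite and the series is the Poisson exponential
        return ("Poisson exponential", None, p, q)
    n = nu1 / q
    kind = "positive binomial" if q > 0 else "negative binomial"
    return (kind, n, p, q)

print(fit_binomial(2, Fraction(3, 2)))  # variance below the mean
print(fit_binomial(2, 3))               # variance above the mean
print(fit_binomial(2, 2))               # variance equal to the mean
```

In the negative case both n and q come out negative, while their product nq remains the observed mean.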

Let us first consider the result of each individual having a different chance of
falling in a given division:

Let the chances of n individuals falling in a given division be $q_1, q_2, q_3 \dots q_n$.
The chances of their not doing so are therefore $(1 - q_1), (1 - q_2), (1 - q_3) \dots (1 - q_n)$,
and the chances that 0, 1, 2 ... of them will fall in that division are given by the
various terms of the expansion of

$$\{(1 - q_1) + q_1\}\, \{(1 - q_2) + q_2\}\, \{(1 - q_3) + q_3\}\, (\dots)\, \{(1 - q_n) + q_n\},$$

i.e. by

$$(1 - q_1)(1 - q_2) \dots (1 - q_n) + S\{q_1 (1 - q_2)(\dots)(1 - q_n)\} + S\{q_1 q_2 (1 - q_3) \dots (1 - q_n)\} + \dots$$
$$+ S\{q_1 q_2 q_3 \dots q_r (1 - q_{r+1}) \dots (1 - q_n)\} + \dots + q_1 q_2 q_3 \dots q_n,$$

the term $S\{q_1 q_2 q_3 \dots q_r (1 - q_{r+1}) \dots (1 - q_n)\}$ giving the chance that exactly r
individuals will fall in the division.

The sum of the above series is clearly unity so that the 1st and 2nd moment
coefficients about the zero end of the series are given by two series of which the
rth terms are

$$r\, S\{q_1 q_2 \dots q_r (1 - q_{r+1}) \dots (1 - q_n)\} \quad \text{and} \quad r^2\, S\{q_1 q_2 \dots q_r (1 - q_{r+1}) \dots (1 - q_n)\}$$

respectively.

These series may be summed by rearranging them in the ascending order of 
the $q$ products thus: 

\begin{align*}
S\{q_1(1-q_2)(1-q_3)\cdots(1-q_n)\} &= S(q_1) - 2S(q_1q_2) + \dots + (-1)^{r-1}\, r\, S(q_1q_2\cdots q_r) + \dots \\
2S\{q_1q_2(1-q_3)\cdots(1-q_n)\} &= 2S(q_1q_2) - \dots + (-1)^{r-2}\, r(r-1)\, S(q_1q_2\cdots q_r) + \dots \\
&\;\;\vdots \\
t\,S\{q_1q_2\cdots q_t(1-q_{t+1})\cdots(1-q_n)\} &= t\,S(q_1q_2\cdots q_t) + \dots + (-1)^{r-t}\,\frac{r!}{(t-1)!\,(r-t)!}\, S(q_1q_2\cdots q_r) + \dots \\
&\;\;\vdots \\
r\,S\{q_1q_2\cdots q_r(1-q_{r+1})\cdots(1-q_n)\} &= r\,S(q_1q_2\cdots q_r) + \dots
\end{align*}

Adding these we get on the left $\nu_1$ and on the right $S(q_1)$ + a number of terms 
of the form $r(1-1)^{r-1} S(q_1q_2\cdots q_r)$ which accordingly vanish and we get 
\[ \nu_1 = S(q_1). \]
In a similar manner it can be shown that 
\[ \nu_2 = S(q_1) + 2S(q_1q_2), \]
and other moment coefficients about zero can be found in the same way, but we 
are not here concerned with them*. 


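If $\bar q$, $\overline{q^2}$ are the mean values of $q$ and $q^2$, obviously 
\[ \nu_1 = n\bar q \tag{1} \]
and, writing $\sigma_q^2 = \overline{q^2} - \bar q^2$, 
\begin{align*}
\nu_2 = S(q_1) + 2S(q_1q_2) &= S(q_1) + \{S(q_1)\}^2 - S(q_1^2) \\
&= n\bar q + n^2\bar q^2 - n\overline{q^2} \tag{2} \\
&= n\bar q + n^2\bar q^2 - n\bar q^2 - n\sigma_q^2, \tag{3}
\end{align*}
i.e. 
\[ \mu_2 = n\bar q - n\bar q^2 - n\sigma_q^2 = n\bar q\left(1 - \bar q - \frac{\sigma_q^2}{\bar q}\right). \tag{4} \]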
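
Equations (1) and (4) can be checked against the exact distribution for a small set of unequal chances. The sketch below is an illustration only and is not from the original paper: the names `count_distribution` and `mean_and_variance` and the four sample chances are assumptions. It multiplies out the product $\{(1-q_1)+q_1\}\{(1-q_2)+q_2\}\cdots$ one factor at a time and compares the resulting moments with the formulae.

```python
def count_distribution(qs):
    """Chances that exactly 0, 1, ..., n individuals fall in the division.

    Expands the product {(1 - q1) + q1}{(1 - q2) + q2}... one factor at a time.
    """
    dist = [1.0]
    for q in qs:
        new = [0.0] * (len(dist) + 1)
        for r, chance in enumerate(dist):
            new[r] += chance * (1.0 - q)   # this individual stays out
            new[r + 1] += chance * q       # this individual falls in
        dist = new
    return dist


def mean_and_variance(dist):
    """First moment about zero (nu1) and second moment about the mean (mu2)."""
    nu1 = sum(r * c for r, c in enumerate(dist))
    nu2 = sum(r * r * c for r, c in enumerate(dist))
    return nu1, nu2 - nu1 ** 2


qs = [0.1, 0.2, 0.05, 0.4]          # hypothetical unequal chances
dist = count_distribution(qs)
nu1, mu2 = mean_and_variance(dist)

n = len(qs)
q_bar = sum(qs) / n
var_q = sum((q - q_bar) ** 2 for q in qs) / n
mu2_formula = n * q_bar * (1.0 - q_bar - var_q / q_bar)   # equation (4)
```

For any such set of chances the ratio `mu2 / nu1` comes out below unity, in agreement with the statement that unequal chances alone tend towards a positive binomial.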


* The moment coefficients are: 
\begin{align*}
\mu_2 &= np\bar q - n\,{}_q\mu_2, \\
\mu_3 &= np\bar q\,(p-\bar q) - 3n(p-\bar q)\,{}_q\mu_2 + 2n\,{}_q\mu_3, \\
\mu_4 &= np\bar q\,\{1+3(n-2)p\bar q\} - n\{7+6(n-6)p\bar q\}\,{}_q\mu_2 + 12n(p-\bar q)\,{}_q\mu_3 - 6n\,{}_q\mu_4 + 3n^2\,{}_q\mu_2^2,
\end{align*}
where ${}_q\mu_2$ etc. are the moment coefficients of the $q$ distribution and $p = 1 - \bar q$. 

If now the distribution of chances is to be represented by the binomial 
$(P+Q)^N$, then 

\[ P = \frac{\mu_2}{\nu_1} = \frac{n\bar q\,(1 - \bar q - \sigma_q^2/\bar q)}{n\bar q} = 1 - \bar q - \frac{\sigma_q^2}{\bar q}, \]
\[ Q = \bar q + \frac{\sigma_q^2}{\bar q}. \tag{5} \]


Since the original $q$’s are the chances of events happening they are always 
positive so that the above expression must be positive and the binomial positive. 

If now we introduce the Poisson condition that $\bar q$ though positive is negligibly 
small (5) becomes in general zero, for $\sigma_q$ is usually of the same order as $\bar q$, and in 
that case Poisson’s law holds in spite of the inequality of the original $q$’s. If 
however $\sigma_q^2/\bar q$ is appreciably greater than zero (as in the extreme case 
[values illegible in the source]) 
the distribution of chances is to be represented by a positive binomial. 

Next we have to consider the effect of disregarding condition (2), namely that 
the chance of an individual falling into it must be the same for each division. 

Let us suppose then that the $q$’s are all different for each division so that $n\bar q$ is 
also different. 

Then writing $m$ for $n\bar q$ and $\bar m$, $\overline{m^2}$, $\overline{nq^2}$ for the means of $m$, $m^2$ and $n\overline{q^2}$ taken 
over all the divisions, we get from (1) 
\[ \nu_1 = \bar m, \tag{6} \]
and from (2) 
\begin{align*}
\nu_2 &= \bar m + \overline{m^2} - \overline{nq^2} \\
&= \bar m + \bar m^2 + \sigma_m^2 - \overline{nq^2}, \tag{7} \\
\mu_2 &= \bar m + \sigma_m^2 - \overline{nq^2}. \tag{8}
\end{align*}

As before if $(P+Q)^N$ is the best fitting binomial, 
\[ Q = 1 - \frac{\mu_2}{\nu_1} = \frac{\overline{nq^2} - \sigma_m^2}{\bar m}. \]

Hence if $\sigma_m^2 > \overline{nq^2}$, which if there is any appreciable variation in $m$ is probable, 
since as explained above $\overline{nq^2}$ is generally negligible, a negative binomial will be 
found to fit better than the exponential. 
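
A small numerical illustration of this result may help. The sketch below is not from the original paper, and it makes a simplifying assumption: within each division the chances are taken to be so small that the counts follow Poisson's law exactly, so that the term $\overline{nq^2}$ is treated as zero. The function names and the two division means are hypothetical.

```python
import math


def poisson_chances(m, terms=80):
    """Chances of 0, 1, 2, ... individuals for a Poisson law of mean m."""
    chances = [math.exp(-m)]
    for r in range(1, terms):
        chances.append(chances[-1] * m / r)
    return chances


def mixed_chances(means, terms=80):
    """Average the Poisson chances over divisions with different means m."""
    per_division = [poisson_chances(m, terms) for m in means]
    return [sum(column) / len(means) for column in zip(*per_division)]


def mean_and_variance(chances):
    nu1 = sum(r * c for r, c in enumerate(chances))
    nu2 = sum(r * r * c for r, c in enumerate(chances))
    return nu1, nu2 - nu1 ** 2


division_means = [1.0, 3.0]                  # hypothetical: m differs by division
nu1, mu2 = mean_and_variance(mixed_chances(division_means))
Q = 1.0 - mu2 / nu1                          # negative when a negative binomial fits
```

Here the mean of $m$ is 2 and its variance is 1, so the variance of the counts is 3, and $Q$ is negative as the argument above predicts.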

Clearly condition (2) is usually not fulfilled in the vital and demographic 
statistics; divisions either of space or time are generally governed by different 
environments which will vary the chances of an individual falling into them, and 
so we may expect that as a rule negative binomials will occur in place of the 
exponential. 

* If we suppose that $q$ does not vary with the individual but that $nq$ ($=m$) varies with the division, 
the moment-coefficients of the $m$ distribution being written ${}_m\mu$, then the moment-coefficients of the 
resulting distribution of divisions are as follows: 
\begin{align*}
\mu_2 &= \bar m + {}_m\mu_2, \\
\mu_3 &= \bar m + 3\,{}_m\mu_2 + {}_m\mu_3, \\
\mu_4 &= \bar m + 3\bar m^2 + (7 + 6\bar m)\,{}_m\mu_2 + 6\,{}_m\mu_3 + {}_m\mu_4.
\end{align*}

Finally, suppose that the presence of an individual in a division influences 
the chance of other individuals falling in that division. 

Clearly it may do so either by way of increasing the chance or diminishing it. 

If the chance be increased it is clear that we shall get for the same mean 
number of individuals per division a larger number of divisions containing high 
numbers of individuals and a larger number of zero divisions. In other words, for 
the same mean we shall get a larger Standard Deviation, so that $\mu_2/\nu_1$ will be 
greater than 1 and a negative binomial will fit better than the exponential. On 
the other hand, if the chance of other individuals is decreased by the presence 
of one already in a division $\mu_2/\nu_1$ will become less than unity and the best fitting 
binomial will be positive. The first of these two cases includes linking or clumping 
of events or bacteria, the second such a thing as the counting of large cells on a 
haemacytometer whose divisions are comparable in size with them. 

We have now shown that a population which might be expected at first sight 
to follow Poisson’s law 

(1) Will do so if the only deviation from the ideal conditions is that the 
chances of different individuals falling into the same division are not equal, as 
long as these chances are all small. 

(2) If in addition to this the chances of some individuals are large a positive 
binomial will fit the results better than the exponential. 

(3) If the different divisions have different chances of containing individuals, 
as is usual, a negative binomial will fit the results better than the exponential, 
except in so far as (2) may interfere. 

(4) If the presence of one individual in a division increases the chance of 
other individuals falling into that division, a negative binomial will fit best, but if 
it decreases the chance a positive binomial. 

Generally speaking (3) is the operating deviation from Poisson’s conditions and 
accordingly most statistics give negative binomials. 

Finally I should like to point out that the object of my original paper (Biometrika, 
Vol. V) was to give the user of the haemacytometer a guide to the error which 
he may expect from its use, and that the net result was that the probable error of 
his count was $0.6745\sqrt{N}$ where $N$ was the total number counted* and that if $N$ be 
a reasonably large number tables of the probability integral may be used, otherwise 
the exponential (or better still go on counting). This result is not affected by 
slight deviations from the Poisson law, any more than slight deviations from the 
normal law affect our use of the probability integral tables. 
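
As a small worked illustration of this rule, the sketch below evaluates the probable error of a total count and of a mean per unit area. It is not part of the original paper; the function names and the sample count of 400 are assumptions chosen for the example.

```python
import math

PROBABLE_ERROR_FACTOR = 0.6745  # ratio of probable error to standard deviation


def probable_error_of_count(total_count):
    """Probable error of a total count N under Poisson's law: 0.6745 * sqrt(N)."""
    return PROBABLE_ERROR_FACTOR * math.sqrt(total_count)


def probable_error_of_mean(mean, unit_areas):
    """Probable error of the mean m over M unit areas: 0.6745 * sqrt(m / M)."""
    return PROBABLE_ERROR_FACTOR * math.sqrt(mean / unit_areas)


# Hypothetical count of 400 cells: the probable error is 0.6745 * 20.
pe_400 = probable_error_of_count(400)
```

Putting a single unit area into the second function gives the same figure as the first, which is the substitution made in the accompanying footnote.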


* Biometrika, Vol. V, p. 355. The probable error of mean is $0.6745\sqrt{m/M}$ where $m$ is the mean and 
$M$ the number of unit areas counted. If in this we put $M=1$, then $m=N$ and the total count is 
$N \pm 0.6745\sqrt{N}$ as above. 

(1) Introduction. 

The object of the present paper is to inquire what is the proper method of 
examining psychophysical curves as to their goodness of fit. In psychophysics 
various mathematical processes are employed for fitting theoretical curves of 
“ogive” form* (known to psychologists as psychometric functions, but really error 
functions), to data of a certain kind, usually threshold measurements collected by 
the “Method of Right and Wrong Cases.” The best known of these mathematical 
processes is the Müller “Constant Process,” using the probability integral†. To 
make the material in which we are about to work understandable, it is necessary 
first to go into some detail as to the nature of the experiments which supply 
the data to be fitted, and as to the theories which have led to such mathematical 
curves being drawn through these data. 

* The term in this connexion is Galton’s. 

† G. T. Fechner, Elemente der Psychophysik, 1860; G. E. Müller, “Ueber die Maassbestimmungen 
des Ortsinnes der Haut mittels der Methode der richtigen und falschen Fälle,” Pflügers Archiv für die 
ges. Physiologie, 1879, XIX, pp. 191-235, especially par. 5 et seq.; G. E. Müller, Die Gesichtspunkte und 
die Thatsachen der Psychophysischen Methodik, Wiesbaden, 1904, par. 11; F. M. Urban, “Die 
Psychophysischen Massmethoden,” Archiv für die ges. Psychologie, 1909, XV, p. 287; G. H. Thomson, “The 
Accuracy of the $\phi(\gamma)$ Process,” Brit. Journal of Psychol., 1914, VII, p. 46, and in various text-books, 
e.g. Titchener’s Experimental Psychology, and W. Brown’s Essentials of Mental Measurement. 

Most of the experiments in question have for their object the determination of 
the conditions of our experiences of equality and difference. For example, suppose 
we compare two weights, one of which is 100 grams, by lifting them in succession 
by the right hand with a number of experimental precautions, into which we need 
not here enter. We wish to know under what conditions the unknown weight 
appears lighter than, equal to, or heavier than the standard weight. An important 
condition is of course the “actual” weight of the unknown weight, as measured in 
the usual manner. But this is by no means the only important condition. The 
order in which the weights are lifted (whether standard first or unknown first); the 
number of categories into which our judgments have to be classified; the order of 
succession of the several unknown weights, whether rising or falling or at random; 
the range over which the succession of unknown weights stretches, whether or no 
it contains any which are quite easily distinguished from the standard; all these 
and many other conditions are of great importance. Steps can however be taken 
to eliminate some of these factors, by means of judicious experimental precautions, 
and the attempt can be made to keep the others as constant as possible during 
a series of trials. The judgments which are given by the subjects then depend 
mainly on the difference between the standard stimulus and the variable stimulus, 
in the case of our example on the difference between the standard weight and the 
variable weight. Among other points of importance in the fitting of the curves is 
the possibility of deciding by means of the goodness of fit whether the experimental 
conditions have really been kept as constant as has been hoped, for lack of constancy 
in this respect will lead to heterogeneity which will show itself by the necessity of 
using a compound curve to obtain a good fit. 

To fix ideas, it is desirable at this point to have an actual set of data to refer to. 
In some very carefully conducted experiments on weight-lifting, Professor F. M. Urban 
(op. cit.) found that, with one of his subjects, under certain experimental conditions, 
the standard weight being 100 grams, the following numbers of answers heavier 
were returned, out of 300 trials with each of the several unknown weights. It 
should be mentioned that the experimental method used involved that the unknown 
weights were presented to the subject in random sequence, accompanied each by 
the standard, so that the 300 trials referred to were not one after the other, but 
were separated from each other by trials with the other unknowns. Otherwise 
expectation and other psychological factors produce a considerable correlation between 
one judgment and the next, which is reduced to a minimum by Urban’s procedure. 
Moreover, precautions against fatigue and several other factors were taken. For the 
details the reader is referred to Urban’s memoir, with the warning that much of 
the mathematical part thereof is incorrect. 


Grams     Answers heavier     Proportion p
  84        7 out of 300         0.0233
  88        8 out of 300         0.0267
  92       35 out of 300         0.1167
  96      107 out of 300         0.3567
 100      183 out of 300         0.6100
 104      265 out of 300         0.8833
 108      279 out of 300         0.9300

It is to numbers such as these that the curves to be considered are fitted. 

Any suitable curve which happened to occur to one might of course be employed. 
For example, a parabola of higher order can be used, and the curve $\tan^{-1}\theta$ has also 
been used. But clearly the whole experiment suggests that an error function of 
some sort is wanted, and as early as 1860 G. T. Fechner (op. cit.) suggested that 
such numbers formed the integral of a normal or Gaussian curve. One usual 
argument is somewhat as follows, using for clearness terms applying directly to the 
above example. 

The existence of a hypothetical point is postulated, called the limen or threshold 
for the judgment heavier, such that above this point the subject always returns the 
answer heavier, and below it he always returns some other answer, not heavier. 
But this limen is supposed to be fluctuating from moment to moment, either really 
or apparently, owing to changes in the physical, physiological and psychological 
conditions of the experiment. If at one moment the answer heavier is returned, 
for the variable 96 grams, then at that moment the limen is below 96 grams. 
Later the answer lighter, or the answer equal may be returned for 96 grams, and at 
that moment the limen is above 96 grams. The values $p$ in the above table will 
then be integrals of the distribution curve of this limen. 


(2) Peculiarities of Psychophysical Data from the Point of View of 
Curve Fitting. 

The problem of fitting a distribution curve integral to such data, say in the 
first place the probability integral, has certain peculiarities which differentiate 
it from many biometric curve-fitting problems. 

Usually, when we are required to fit a normal curve, we are given the data in 
histogram form. 
That is, a number $M$ of direct measurements is made, and $m_1$ are found to fall into 
a certain short range, $m_2$ into another adjacent range, and so on. To fit a Gauss 
curve requires the mean and the standard deviation, and these quantities can 
be directly found from such a histogram, Sheppard’s adjustments being used if 
necessary. 

Quantities analogous to our proportions $p$ can be formed from such a biometric 
histogram, viz.: 
\begin{align*}
p_1 &= m_1/M, \\
p_2 &= (m_1 + m_2)/M, \\
p_3 &= (m_1 + m_2 + m_3)/M, \\
&\;\;\vdots
\end{align*}
and vice versa, quantities analogous to the $m$’s can be formed from the proportions 
$p$ of the psychometric experiment, viz.: 
\begin{align*}
m_1 &= p_1 M, \\
m_2 &= (p_2 - p_1) M, \\
m_3 &= (p_3 - p_2) M, \\
&\;\;\vdots
\end{align*}

In the case of our example we should have: 

Range (grams)     Cases m      m/M
Below 84              7       0.0233
 84-88                1       0.0034
 88-92               27       0.0900
 92-96               72       0.2400
 96-100              76       0.2533
100-104              82       0.2733
104-108              14       0.0467
Above 108            21       0.0700
Totals              300       1.0000
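
The formation of this pseudohistogram from the observed numbers can be sketched in a few lines of Python. This is an illustration added here and not part of the original paper; the function name `pseudo_histogram` is an assumption, and the data are the counts of answers heavier quoted above.

```python
def pseudo_histogram(cumulative_counts, total):
    """Cell counts m formed from cumulative counts p * M.

    The first cell lies below the lowest stimulus and the last above the
    highest; each intermediate cell is the difference of adjacent counts.
    """
    cells = []
    previous = 0
    for count in cumulative_counts:
        cells.append(count - previous)
        previous = count
    cells.append(total - previous)
    return cells


heavier = [7, 8, 35, 107, 183, 265, 279]   # answers "heavier" out of 300
cells = pseudo_histogram(heavier, 300)
proportions = [m / 300 for m in cells]

# A hypothetical series in which p does not rise steadily gives a negative cell.
irregular = pseudo_histogram([10, 8], 20)
```

The second call shows the peculiarity taken up in the following paragraphs: because each proportion is measured separately, a cell of the pseudohistogram can come out negative, which cannot happen in a histogram of direct measurements.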


There are however important differences which make the analogy inexact from the 
curve-fitting point of view. 

In the biometric histogram, if any one of the cells $m$ is larger than it ought to 
be, then any other must have a tendency to be smaller than it ought to be. There 
is a strong negative correlation between the numbers in the cells, a correlation, 
that is, from trial to trial. In the psychometric pseudohistogram however, formed 
from the proportions $p$, this is otherwise, because the $p$’s are measured quite 
separately from one another. 


In the biometric histogram the m’s, the numbers in each cell, are necessarily 
positive quantities. In the psychometric pseudohistogram they may be negative, 
if the p’s do not rise steadily. In the biometric histogram the actual range found 
in a trial is as a rule known, that is the points where p is zero and p is unity are 
known. In the psychometric case these points are as a rule not known, and there 
may be psychological reasons why extreme stimuli (such as would be required to 
find these points) should not be used. In our example we do not know whether 
the subject would have given no answers heavier at 80 grams, or whether at the 
other end he would have given only answers heavier at 112 grams. 


When we do know these points, or can assume them, in the psychometric case, 
we can fit our probability integral by forming the pseudohistogram, and calculating 
the mean and the standard deviation as though it were a real histogram*. This has 
been suggested by more than one writer, in England by Professor C. Spearman†, 
who does not however point out the difficulty that it cannot as a rule be done, 
because the points $p=0$ and $p=1$ are not known. 

* The actual arithmetical formation of the histogram is unnecessary if a summation method of 
finding moments is employed. 

† Brit. Journ. Psychol., 1908, II, p. 227. 

In biometric language, the problem is to fit a normal curve to data for which the 
“tails” are undefined as to range, although their areas are known. This problem 
was solved by Müller (op. cit.) as follows: 


(3) The Constant Process. 

Call the stimuli $s_1, s_2, s_3, \dots s_n$, 
and the proportions $p_1, p_2, p_3, \dots p_n$, 
then we have $n$ equations 
\[ p - \frac12 - \frac{1}{\sqrt\pi}\int_0^{h(s-S)} e^{-x^2}\,dx = 0 \tag{1} \]
to find the mean $S$ and the precision $h$. We retain for the present this form of the 
integral as being more familiar to psychologists. The more modern form would 
have the standard deviation instead of the precision as the second unknown. 


These equations are slightly inconsistent with one another. No pair of values 
$S$ and $h$ will exactly satisfy all $n$ equations; instead of giving zero they leave small 
residuals $v_p$. 

Müller assumed tacitly that these $n$ equations if based on the same number of 
experiments each, are of equal importance or weight*. We shall allow this 
assumption to pass for the present but shall return to it later. 

If we now make the usual assumptions of the Method of Least Squares, we can 
take as the best values of $S$ and $h$ those which make 
\[ \Sigma(v_p^2) \ \text{a minimum}, \]
where the summation is over the $n$ stimuli or $n$ equations. The conditions that 
this should be so are 
\[ \frac{\partial}{\partial h}\Sigma(v_p^2) = 0 \ \text{for constant } S, \qquad \frac{\partial}{\partial S}\Sigma(v_p^2) = 0 \ \text{for constant } h. \tag{2} \]


* There is unfortunately a possibility of ambiguity of language here as the word weight also occurs 
in the particular example we are using as illustration, where weights of 84 grams etc. are employed. 

Unfortunately the $n$ equations are very far from being simple and linear 
as in usual applications of Least Squares. To avoid this difficulty we look up in 
tables of the Probability Integral (which psychologists call Fechner’s Fundamental 
Table) those $n$ values of 
\[ \gamma = h(s - S) \tag{3} \]
which correspond exactly to our $n$ values of $p$. These equations are not yet linear 
in $S$ and $h$, but if we write 
\[ c = hS \tag{4} \]
they become 
\[ \gamma = hs - c \tag{5} \]
and are now linear in $h$ and $c$. We have now $n$ linear equations in which $\gamma$ and $s$ 
are known, $h$ and $c$ are required. If we insert any pair of values $h$ and $c$ into these 
$n$ equations they will leave residuals $v_\gamma$. If we were now to proceed to make 
\[ \Sigma(v_\gamma^2) \ \text{a minimum}, \]
this would not effect our purpose. It is $\Sigma(v_p^2)$ we wish to make a minimum, not 
$\Sigma(v_\gamma^2)$. If however we can find multipliers or weights $M$ such that each 
\[ M v_\gamma^2 = v_p^2 \tag{6} \]
we can then make $\Sigma(M v_\gamma^2)$ a minimum. 

That is, we can apply Least Squares to the equations (3), weighted with certain 
artificial weights $M$. The use of this device is Müller’s particular credit in this 
connexion. 
Clearly the residuals $v_p$, which may be regarded as errors in $p$, are connected 
with the residuals $v_\gamma$, which may be regarded as errors in $\gamma$, by the equation 
\[ v_p = \frac{e^{-\gamma^2}}{\sqrt\pi}\, v_\gamma, \]
from equations (1) and (3). Therefore 
\[ M = e^{-2\gamma^2}/\pi. \]
Herein we can omit the $\pi$ since it is only the relative values of the Müller weights 
which are of importance. These weights are given in most works on psychophysics, 
e.g. W. Brown, or Titchener, op. cit. 

The condition that $\Sigma(v_p^2)$ should be a minimum has now become, that $\Sigma(M v_\gamma^2)$ 
should be a minimum. With this substitution, the Normal Equations (2) give 
\begin{align*}
[Ms^2]\,h - [Ms]\,c &= [Ms\gamma], \\
-[Ms]\,h + [M]\,c &= -[M\gamma], \tag{7}
\end{align*}
the square brackets being the sign of summation used by Gauss, and still persisting 
in psychophysics in this connexion. The summation here is over the $n$ equations. 
Thence we have 
\[ c = \frac{[Ms][Ms\gamma] - [M\gamma][Ms^2]}{[M][Ms^2] - [Ms]^2}, \qquad h = \frac{[M][Ms\gamma] - [Ms][M\gamma]}{[M][Ms^2] - [Ms]^2}, \tag{8} \]
\[ S = \frac{c}{h} = \frac{[Ms][Ms\gamma] - [M\gamma][Ms^2]}{[M][Ms\gamma] - [M\gamma][Ms]}. \]
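
The Constant Process as described can be sketched in Python. The code below is an illustration only and not part of the original paper: the function names are assumptions, and in place of a printed table of the probability integral it uses the standard library's `statistics.NormalDist().inv_cdf`, relying on the identity $p = \Phi(\gamma\sqrt2)$ for the form of the integral in equation (1). The data are Urban's counts quoted in the Introduction.

```python
import math
from statistics import NormalDist


def gamma_from_p(p):
    """Value of gamma with p = 1/2 + (1/sqrt(pi)) * integral_0^gamma exp(-x^2) dx.

    Equivalent to p = Phi(gamma * sqrt(2)), so gamma is the normal deviate / sqrt(2).
    """
    return NormalDist().inv_cdf(p) / math.sqrt(2.0)


def constant_process(stimuli, proportions):
    """Weighted least squares for gamma = h*s - c with Mueller weights exp(-2 gamma^2)."""
    gammas = [gamma_from_p(p) for p in proportions]
    weights = [math.exp(-2.0 * g * g) for g in gammas]   # factor 1/pi omitted
    M = sum(weights)
    Ms = sum(w * s for w, s in zip(weights, stimuli))
    Mss = sum(w * s * s for w, s in zip(weights, stimuli))
    Mg = sum(w * g for w, g in zip(weights, gammas))
    Msg = sum(w * s * g for w, s, g in zip(weights, stimuli, gammas))
    denominator = M * Mss - Ms ** 2
    h = (M * Msg - Ms * Mg) / denominator
    c = (Ms * Msg - Mg * Mss) / denominator
    return c / h, h                                      # S = c / h


stimuli = [84, 88, 92, 96, 100, 104, 108]
proportions = [k / 300 for k in (7, 8, 35, 107, 183, 265, 279)]
S, h = constant_process(stimuli, proportions)
```

With Müller's weights alone the fitted mean $S$ falls a little above 98 grams; the values differ slightly from those obtained with other weightings.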


(4) The Probability of a Certain Category of Judgment. 


The Constant Process remained in this form from 1879 to 1909. It is very 
much mixed up with the psychological method of experimenting and collecting the 
data, so that frequently the name “Method of Right and Wrong Cases,” really the 
name of a certain method of collecting data, has been used to include this 
mathematical process. To avoid this mental confusion I have elsewhere* suggested that 
the two words Method and Process should in psychophysics be consistently used in 
the way in which they are employed in the above sentence, viz. Method of collecting 
data, and Process of calculating. Frequently the Constant Process has been called 
the phi-gamma method, from the use of the name phi-gamma for the probability 
function. 


In 1909, F. M. Urban (op. cit.) suggested alterations to the Müller weights, or 
rather suggested the necessity of another set of weights in addition†. These 
alterations arise from the notion of comparing the judgments heavier with the 
drawing of black balls from a bag containing black balls and white balls. The 
analogy is in detail as follows. 

(1) From a bag containing black balls and white balls 300 drawings are made, 
one at a time, the ball being returned each time before the next drawing is made. 
107 black balls are observed out of the 300. 

(2) A subject on performing a certain experiment with weights sometimes 
gives the answer heavier, sometimes some other answer. On one occasion, when 
the weights were 100 grams standard and 96 grams unknown, this experiment was 
repeated 300 times, with due precautions against fatigue, etc., and the answer 
heavier was returned 107 times out of the 300. 


Now if $p$ is the observed proportion (here 107/300) of black balls in a bag, then 
the probable error of $p$ is known to vary with $\sqrt{p(1-p)}$‡. With the same sized 
sample, a result $p = 0.5$ has a larger probable error than a result $p = 0.8$, say. If 
anything similar holds, as the analogy suggests, for the psychometric experiment, 
then the $n$ equations (1) or (5) are not equally reliable, even although based on the 
same number, 300, of experiments each. In addition to the weights $M$ they need 
other weights 
\[ \frac{1}{4pq} \tag{9} \]
to allow for this new variation in reliability. The combined weights $M/4pq$ are 
known as Urban’s weights, and a table of these is usually given in psychophysical 
textbooks alongside the ordinary Müller weights. Urban discusses the matter at 
some length in his already cited article, and a discussion will also be found in 
Wirth’s Psychophysik (Leipzig, 1912) where on page 151 the actual scatter of 
various $p$’s is given in a diagram. 
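
The two sets of weights can be compared directly. The sketch below is illustrative and not from the original paper: the function names are assumptions, the probable error is taken as $0.6745\sqrt{pq/n}$ on the bag-of-balls analogy, and the inverse normal function of the Python standard library stands in for a printed table.

```python
import math
from statistics import NormalDist


def probable_error(p, trials):
    """Probable error of an observed proportion p on the binomial analogy."""
    return 0.6745 * math.sqrt(p * (1.0 - p) / trials)


def muller_weight(p):
    """Mueller weight exp(-2 gamma^2), with gamma found from p."""
    gamma = NormalDist().inv_cdf(p) / math.sqrt(2.0)
    return math.exp(-2.0 * gamma * gamma)


def urban_weight(p):
    """Combined weight M / (4 p q), with q = 1 - p."""
    return muller_weight(p) / (4.0 * p * (1.0 - p))


proportions = [k / 300 for k in (7, 8, 35, 107, 183, 265, 279)]
weights = [(p, muller_weight(p), urban_weight(p)) for p in proportions]
```

Since $4pq$ never exceeds unity, the combined weight is never smaller than the Müller weight, and the two agree at $p = 0.5$; the extra factor therefore raises the relative importance of the extreme proportions.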


* Brit. Journ. Psychol. 1912, V, p. 203. 

† There are many errors in the article of Urban’s quoted. See my articles in the Brit. Journ. Psychol., 
1913, VI, p. 217, and 1914, VII, p. 44. But these errors, though making Urban’s conclusions in that 
article invalid, do not touch the point here raised, in which I think Urban’s suggestion marks an advance. 

‡ Really the true values of $p$ and $1-p$ should be used but this is the best we can do. And further, 
the expression, probable error, ceases to have an accurate meaning when $p$ is too close to zero or unity 
and the distribution in consequence is very skew. But these refinements do not matter at this stage of 
our argument. 

Replacing in equation (7) therefore the weights $M$ by the new Urban weights $P$, 
Urban found in the present instance 
\[ S = 98.24 \ \text{grams}, \qquad h = 0.117995. \]

That is he represents the proportions $p$ theoretically by using the hypothesis that 
the “psychometric function,” as psychologists call it, is given by 
\[ p = \frac12 - \frac{1}{\sqrt\pi}\int_0^{0.117995\,(98.24 - s)} e^{-x^2}\,dx. \tag{10} \]


The theoretical values $p'$ thus calculated are compared with the actual values $p$ in 
this table. 

Grams       p          p'       Difference x
  84      0.0233     0.0088      +0.0145
  88      0.0267     0.0438      -0.0171
  92      0.1167     0.1489      -0.0322
  96      0.3567     0.3544      +0.0023
 100      0.6100     0.6155      -0.0055
 104      0.8833     0.8319      +0.0514
 108      0.9300     0.9483      -0.0183
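
The theoretical column can be recomputed from the fitted constants. The sketch below is an added illustration, not part of the original paper; it assumes the form $p' = \tfrac12\{1 - \operatorname{erf}(h(S - s))\}$, which is the probability integral of equation (10) expressed through the error function of the Python standard library, and the function name is hypothetical.

```python
import math

S = 98.24        # fitted mean in grams, as quoted from Urban
h = 0.117995     # fitted precision, as quoted from Urban


def psychometric(s):
    """Theoretical proportion of answers 'heavier' at stimulus s.

    (1/sqrt(pi)) * integral_0^g exp(-x^2) dx equals erf(g) / 2.
    """
    return 0.5 * (1.0 - math.erf(h * (S - s)))


stimuli = [84, 88, 92, 96, 100, 104, 108]
observed = [0.0233, 0.0267, 0.1167, 0.3567, 0.6100, 0.8833, 0.9300]
theoretical = [psychometric(s) for s in stimuli]
differences = [p - t for p, t in zip(observed, theoretical)]
```

The recomputed values should agree with the printed column to within about a unit in the fourth decimal place, the small discrepancies presumably arising from the rounding of the tables originally used.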


The object of the present paper is to make clear the proper methods (a) of 
ascertaining, in all such cases, whether the theoretical numbers are a reasonable fit 
to the observed numbers, or not, and (b) of comparing the fits obtained by different 
hypotheses, that is by different error functions. The psychologist would express 
this by saying that he was comparing different psychometric functions. To the 
statistician the comparison is one of error functions, the natural procedure being to 
try first the normal curve, then members of Pearson’s family of curves, then 
compound curves; the conclusion in the latter case being that the material was 
not homogeneous. This work I have as a matter of fact already carried out, and 
have come to that conclusion; but it is beyond the scope of the present paper, 
which hopes to interest psychologists in modern statistical methods, and statisticians 
in modern psychology. 


(5) Pearson's Criterion of Goodness of Fit. 


This problem, of comparing the goodness of fit of curves in psychophysics, 
although it has not as far as I am aware ever been correctly performed, is really 
very simple, and could be handled at once from first principles. For the sake however 
of showing the connexion with other work it is advisable to treat it as a special 
case of the application of Pearson’s Criterion of Goodness of Fit*, which is in brief 
as follows. 

* Karl Pearson, Phil. Mag., July 1900 and April 1916. 

Let $x_1, x_2, x_3, \dots x_n$ 
be a system of deviations from the means of $n$ variables whose standard deviations are 
$\sigma_1, \sigma_2, \sigma_3, \dots \sigma_n$, 
and intercorrelations $r_{12}, r_{13}, r_{23}, \dots r_{n-1,n}$. 

Then the frequency “surface” giving the frequency of occurrence of each possible 
combination of $x$’s is 
\[ z = z_0\, e^{-\frac12\chi^2}, \tag{11} \]
where 
\[ \chi^2 = S_1\!\left(\frac{R_{kk}}{R}\,\frac{x_k^2}{\sigma_k^2}\right) + 2S_2\!\left(\frac{R_{kl}}{R}\,\frac{x_k x_l}{\sigma_k\sigma_l}\right), \tag{12} \]
and $R_{kk}$, $R_{kl}$ are the minors corresponding to $r_{kk}$ and $r_{kl}$ in the determinant $R$ of the correlations. $S_1$ is a sum over all $k$’s, 
and $S_2$ is a sum over all pairs $kl$ other than $k=l$. 

When $\chi^2$ has been calculated, a probability $P$ can be found, from Table XII in 
Pearson’s Tables for Statisticians. This table is entered by $n'=(n+1)$ and $\chi^2$, 
and gives values of 
\[ P = \frac{\int_\chi^\infty \chi^{n-1} e^{-\frac12\chi^2}\,d\chi}{\int_0^\infty \chi^{n-1} e^{-\frac12\chi^2}\,d\chi}, \]

that is, $P$ is the probability that a random sample of as bad a fit as the data, or 
worse, would be obtained from the theory which is being tested. The kind of data 
for which this criterion was first invented was data in real histogram form, of the 
kind called in earlier sections of this paper a biometric histogram. When the data 
are of this form, Pearson has shown that equation (12) reduces to the very simple 
form 
\[ \chi^2 = S\!\left(\frac{e^2}{m'}\right), \tag{15} \]
where $m'$ is the theoretical value of $m$, and $e$ is $m - m'$, and $S$ indicates summation 
over all the cells of the histogram. Psychophysical data of the kind here considered, 
however, as has already been pointed out, are not really in histogram form. 
Although a histogram can be deduced from them, it is only by making certain 
assumptions, and the intercorrelations of the cells of this artificial histogram are 
different from the intercorrelations of a natural directly observed histogram. 

It is not correct therefore to use equation (15) above. It is more accurate and 
withal exceedingly simple to apply equation (12) direct to the $p$’s. Since the 
latter are independent, all the intercorrelations $r$ are zero. Therefore $R$ is unity, 
$R_{kk}$ is unity, and $R_{kl}$ is zero. Equation (12) therefore becomes 
\[ \chi^2 = S\!\left(\frac{x^2}{\sigma^2}\right), \tag{16} \]
and as the distributions of each $p$ will be binomial in form provided the experimental 
conditions remain constant enough we have* 
\[ \sigma^2 = \frac{p'q'}{\mu}, \tag{17} \]
where $\mu$ = the number of experiments on which $p$ is based, and $p' = 1-q'$, so that† 
\[ \chi^2 = S\!\left(\frac{\mu x^2}{p'q'}\right). \tag{18} \]

Herein the $x$’s are the differences between observed $p$’s and theoretical $p$’s. The 
probability $P$ is then found as before. 


* If we look upon the judgments heavier, as suggested in an earlier paragraph, as being comparable 
with drawing black balls out of a bag containing black balls and white balls in the proportion $p'$ and 
$1-p'$, then the probable error of $p$ is $0.67449\sqrt{p'q'/\mu}$, $\mu$ being the number of judgments of which 
$p\mu$ are of the category heavier. 
For the chances of obtaining $0, 1, 2, \dots \mu-1$, or $\mu$ black balls in a drawing of $\mu$ are given by the 
terms of 
\[ (p' + q')^{\mu}, \]
$q'$ being $1-p'$: that is, the chances of obtaining 
\[ p = \frac{0}{\mu},\ \frac{1}{\mu},\ \frac{2}{\mu},\ \dots\ \frac{\mu-1}{\mu},\ \frac{\mu}{\mu}. \]
The s.d. of the above binomial is $\sqrt{\mu p'q'}$ and the s.d. of $p$ therefore $\sqrt{p'q'/\mu}$. 

† Compare Professor K. Pearson on “Goodness of Fit in Statistics and Physics,” Biometrika, 1916, 
XI, pp. 239-261, especially p. 257. 
We can check our equation (18) by treating the matter from first principles, and not as a special case 
included in Pearson’s formulae. We have, from this point of view, $n$ quantities $p$ which are independently 
measured, and $n$ quantities $p'$ which are theoretically given. The variations from $p'$ are binomial in 
form, that is, they are approximately Gaussian. The probability of an error 
\[ x_k = p_k - p'_k \]
is therefore 
\[ w_k = \sqrt{\frac{\mu}{2\pi\, p'_k q'_k}}\; e^{-\frac{\mu x_k^2}{2 p'_k q'_k}}. \tag{a} \]
The probability of the whole set of observed values $p_1, p_2, p_3, \dots p_n$ occurring is the product 
\[ z = w_1 w_2 w_3 \dots w_n. \tag{b} \]
Write this $z = z_0\, e^{-\frac12\chi^2}$. 
Then 
\[ \chi^2 = S\!\left(\frac{\mu x^2}{p'q'}\right) \]
from equation (a). 

(6) Numerical Example. 


Criterion of Goodness of Fit of Psychophysical Curves 


Let us apply these formulae to the example already cited. The calculations 
are carried out in the following table. The theoretical $p'q'$’s should be used, clearly, 
as denominators of the terms of $\chi^2$. 

  p'        q'        p'q'         x^2         x^2/p'q'
0.0088    0.9912    0.00872    0.00021025      0.0241
0.0438    0.9562    0.04188    0.00029241      0.0070
0.1489    0.8511    0.12673    0.00103684      0.0082
0.3544    0.6456    0.22880    0.00000529      0.0000
0.6155    0.3845    0.23665    0.00003025      0.0001
0.8319    0.1681    0.13983    0.00264196      0.0189
0.9483    0.0517    0.04903    0.00033489      0.0069
                                               0.0652 = S(x^2/p'q')

The number of experiments was the same for each p, viz. 300, therefore 
px 
~=S (a) = 300 x 0652 = 19-56%, 
Pq 


The Table XII in Pearson’s Tables to find P has to be entered with x? and 
n’ =(n +1), where n is the number of variates, here the number of p’s, i.e. 7. We 


find there 


j—3, 19) 0081S i — 2. aye 20 0005 (0 


It is unnecessary, with data such as we are here handling, to interpolate elaborately. 
Clearly, for y? = 19°56, P is of the order 
foie 


That is to say, in only seven cases in a thousand should we expect to get our 
present observed p’s from our theoretical p’s by random sampling. It is therefore 
not at all probable that the equation (1) truly represents the “ psychometric 
function ” for this subject and this reaction. 
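
The calculation of this section can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `chi2_upper_tail` is hypothetical, and instead of looking P up in Table XII it evaluates the chi-square tail from the standard closed-form finite series for integer degrees of freedom (degrees of freedom = n' − 1, here 7). Because it works from the unrounded p'q' products, its χ² differs slightly from the 19.56 of the table.

```python
import math

def chi2_upper_tail(chi2, df):
    """P(X > chi2) for a chi-square variate with integer df (finite series)."""
    if df % 2 == 0:
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= (chi2 / 2.0) / k
            total += term
        return math.exp(-chi2 / 2.0) * total
    # Odd df: normal tail plus a finite series in chi2.
    term, total = 1.0, 0.0
    for k in range((df - 1) // 2):
        if k > 0:
            term *= chi2 / (2 * k + 1)
        total += term
    return (math.erfc(math.sqrt(chi2 / 2.0))
            + math.sqrt(2.0 * chi2 / math.pi) * math.exp(-chi2 / 2.0) * total)

# Observed and theoretical (normal integral) proportions from the tables above.
observed = [0.0233, 0.0267, 0.1167, 0.3567, 0.6100, 0.8833, 0.9300]
theory = [0.0088, 0.0438, 0.1489, 0.3544, 0.6155, 0.8319, 0.9483]
trials = 300  # experiments at each stimulus

chi2 = trials * sum((p - t) ** 2 / (t * (1.0 - t))
                    for p, t in zip(observed, theory))
p_value = chi2_upper_tail(chi2, df=len(observed))
print(round(chi2, 2), round(p_value, 4))
```

Entering the function with χ² = 19 and 20 at 7 degrees of freedom gives the two tabular values between which the text interpolates.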


(7) Urban's incorrect method of comparing Goodness of Fit.

In the article from which the above example is taken, Professor Urban was inter alia desirous of comparing various hypotheses of the "psychometric function" among themselves. Those which he fully works out are (1) the above assumption that it is the integral of the normal probability curve, and (2) the assumption that it is an "arctan." curve (tan θ). (It is needless to point out surely that the latter hypothesis is in itself most unlikely; however, we are here concerned with an empirical comparison of the two hypotheses, and it is important that the method should be correct since it will be necessary to compare other and more likely theories, as for example Pearson's curves.)

* Compare Appendix. 
It can now be shown that the methods which Professor Urban employed in comparing these two hypotheses are incorrect and inadequate. What these inadequate methods are can best be shown by continuing the above example, which is taken at random from among Urban's material.

We have already found the squares $x^2$ of the differences between theory and observation in the case of the normal integral, or as Urban calls it the $\Phi(\gamma)$ hypothesis. They are given in the table just above, and

$$S(x^2) = .00455189.$$

We now proceed to form the analogous quantity in the case of the arctan. hypothesis.

    Grams    Observed p     p'         x          x²
     84        .0233       .0795     −.0562     .00315844
     88        .0267       .1086     −.0819     .00670761
     92        .1167       .1682     −.0515     .00265225
     96        .3567       .3259     +.0308     .00094864
    100        .6100       .6464     −.0364     .00132496
    104        .8833       .8222     +.0611     .00373321
    108        .9300       .8872     +.0428     .00183184
                                                .02035695 = S(x²)

Urban now compares the $\Phi(\gamma)$ hypothesis with the arctan. hypothesis by comparing

    .00455189 with .02035695,

and deciding that as the former is smaller, therefore the $\Phi(\gamma)$ hypothesis is superior. This procedure is firstly inaccurate and secondly inadequate. It is inaccurate because not $S(x^2)$ but $S(x^2/p'q')$ should be compared, and it is inadequate because no idea is given whether the observed difference is significant or not.


The former point deserves a little more examination, because it is another form of an error which Urban was the first to correct, in this same article. In the form of the Constant Process as it left the hands of G. E. Müller, certain weights are to be used on the observation equations. These weights may be called Müller's weights. Urban pointed out, however, that they needed amendment, and published (loc. cit.) a table of weights to replace them. These weights differ from Müller's by the factor 1/4pq, which arises in Urban's treatment from an application of what he calls Bernoulli's Theorem. It is these very Bernoulli weights, 1/pq, which Urban himself has omitted in his above comparison of the $\Phi(\gamma)$ and arctan. hypotheses.

In order to discuss the inadequacy of his comparison we need a measure of the probable error of the quantity P used above.
(8) The Probable Error of $\chi^2$ and P.

We have $$\chi^2 = w\,S\left\{\frac{(p-p')^2}{p'q'}\right\} \quad\text{(from eqn. 18)}.$$

If the accurate values of p' were known, the variation of $\chi^2$ would be due entirely to the variations in the observed values p. In point of fact, of course, the p''s which are available are themselves functions of the p's: but like Pearson in his 1914 article on the probable error of a coefficient of contingency*, and for the same reason and with I think the same justification, we shall assume that the p''s do not vary. Then the mean square deviation

$$\sigma^2_{\chi^2} = S\left\{\sigma_p^2\left(\frac{2w\,(p-p')}{p'q'}\right)^2\right\} = 4w\,S\left\{\frac{(p-p')^2}{p'q'}\right\} = 4\chi^2 \quad\ldots(19),$$

since $\sigma_p^2 = p'q'/w$. Therefore the probable error of $\chi^2$ calculated in the way suitable for the Constant Process and other processes for fitting psychometric functions is

$$.6745\,\sigma_{\chi^2} = 1.349\sqrt{\chi^2} \quad\ldots(20).$$

In the case above where $\chi^2 = 19.56$, its probable error is therefore about 5.9, so that we have

$$\chi^2 = 19.6 \pm 5.9.$$
We must next find $\chi^2$ and its probable error for the arctan. hypothesis. The calculations are partly carried out above in finding $S(x^2)$. Completing them we obtain the following table:

    p'        q'        p'q'       x²            x²/p'q'
   .0795     .9205     .07318     .00315844      .0431
   .1086     .8914     .09680     .00670761      .0693
   .1682     .8318     .13991     .00265225      .0189
   .3259     .6741     .21968     .00094864      .0043
   .6464     .3536     .22856     .00132496      .0058
   .8222     .1778     .14718     .00373321      .0254
   .8872     .1128     .10006     .00183184      .0183
                                                 .1851 = S(x²/p'q')

$$\chi^2 = 300 \times .1851 = 55.53.$$

Probable error of $\chi^2 = 1.349\sqrt{\chi^2} = 10.0$.

    For arctan.,   χ² = 55.53 ± 10.0.
    For Φ(γ),      χ² = 19.6 ± 5.9.
    Difference        = 35.9 ± 11.6,

where $11.6 = \sqrt{10.0^2 + 5.9^2}$.
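
The comparison just made can be reproduced with a short sketch. The helper names `chi_square` and `probable_error` are hypothetical, and the probable error is taken as .6745 × 2√χ² in the sense of equation (20). Because the unrounded p'q' products are used, the figures differ in the second decimal place from those printed above.

```python
import math

def chi_square(observed, theory, trials):
    """chi^2 = trials * S{(p - p')^2 / (p' q')} with equal trials per stimulus."""
    return trials * sum((p - t) ** 2 / (t * (1.0 - t))
                        for p, t in zip(observed, theory))

def probable_error(chi2):
    """Probable error of chi^2: .6745 times its standard deviation 2*sqrt(chi^2)."""
    return 0.6745 * 2.0 * math.sqrt(chi2)

observed = [0.0233, 0.0267, 0.1167, 0.3567, 0.6100, 0.8833, 0.9300]
normal = [0.0088, 0.0438, 0.1489, 0.3544, 0.6155, 0.8319, 0.9483]
arctan = [0.0795, 0.1086, 0.1682, 0.3259, 0.6464, 0.8222, 0.8872]

chi2_normal = chi_square(observed, normal, 300)
chi2_arctan = chi_square(observed, arctan, 300)

difference = chi2_arctan - chi2_normal
# Probable errors of independent quantities combine as root sum of squares.
pe_difference = math.hypot(probable_error(chi2_arctan),
                           probable_error(chi2_normal))
ratio = difference / pe_difference
print(round(chi2_arctan, 2), round(difference, 1), round(pe_difference, 1))
```

A ratio of about three between the difference and its probable error is what the text goes on to call "just significant".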


* Biometrika, Vol. x. 
The difference is therefore three times its probable error and is just significant. The final conclusion is therefore that in this particular case the arctan. hypothesis is just significantly worse than the normal integral or $\Phi(\gamma)$ hypothesis, but that the latter itself is very improbable. The P of the normal integral hypothesis it will be remembered was .007. The P of the arctan. hypothesis can be found from Table XII of Pearson's Tables. The entry has to be made with $n' = 7 + 1 = 8$, and $\chi^2 = 55.5$, and we find given P = .000000, i.e. it is less than .0000005, showing how very improbable the arctan. hypothesis is.

The probable error of P is discussed by Professor Pearson in the Phil. Mag. for April 1916 and he shows that the standard deviation

$$\sigma_P = \tfrac12\,(P_{n'} - P_{n'-2})\,\sigma_{\chi^2} \quad\ldots(21),$$

and using equation (19) we get

$$\sigma_P = (P_{n'} - P_{n'-2})\,\chi \quad\ldots(22).$$

It must be borne in mind that $n' = n + 1$, where n = number of variates. In our case therefore, n' is one more than the number of stimuli. $P_{n'-2}$ is similarly obtained from Table XII of Pearson's Tables by entering with the column with heading one less than the number of stimuli. For the above $\Phi(\gamma)$ hypothesis we have

    χ² = 19.6,
    Number of stimuli = 7,
    P or P_8 = .007 approximately,
    P_6 = .002,
    σ_P = (P_8 − P_6) χ = .005 √19.6 = .022,
    Probable error of P = .6745 σ_P = .015.

Therefore for the $\Phi(\gamma)$ hypothesis the criterion of goodness of fit is in this case

$$P = .007 \pm .015.$$

It is most improbable, therefore, that P is at all large, and the fit is significantly a bad one. The probable error of P for the arctan. hypothesis is too minute to be found from the table.
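
A minimal sketch of this computation follows, assuming the tabular entries $P_8$ and $P_6$ are replaced by a direct evaluation of the chi-square tail (the function name `tail_odd_df` is hypothetical and covers only odd degrees of freedom, which is all that is needed here: n' = 8 and n' = 6 correspond to 7 and 5 degrees of freedom).

```python
import math

def tail_odd_df(chi2, df):
    """P(X > chi2) for a chi-square variate with odd df (finite series)."""
    term, total = 1.0, 0.0
    for k in range((df - 1) // 2):
        if k > 0:
            term *= chi2 / (2 * k + 1)
        total += term
    return (math.erfc(math.sqrt(chi2 / 2.0))
            + math.sqrt(2.0 * chi2 / math.pi) * math.exp(-chi2 / 2.0) * total)

chi2 = 19.6
p8 = tail_odd_df(chi2, 7)   # column n' = 8 of Table XII
p6 = tail_odd_df(chi2, 5)   # column n' = 6 of Table XII

sigma_p = (p8 - p6) * math.sqrt(chi2)   # equation (22)
probable_error_p = 0.6745 * sigma_p
print(round(p8, 4), round(p6, 4), round(sigma_p, 3), round(probable_error_p, 3))
```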
The calculations we have performed have been for Urban's Subject IV (heavier answers). For his other data similar calculations can be carried out. The arctan. hypothesis is usually worse than the normal integral, but not always significantly worse, and the normal integral itself is an atrociously bad fit to the data in most cases.


(9) Summary of Rules for Testing and Comparing Goodness of Fit of Psychometric Curves.

Let there be n stimuli, and let

$$p_1',\ p_2',\ p_3',\ \ldots\ p_n'$$

be the theoretical frequencies at these stimuli, and

$$p_1,\ p_2,\ p_3,\ \ldots\ p_n$$

the observed values. Let

$$w_1,\ w_2,\ w_3,\ \ldots\ w_n$$

be the number of experiments at each stimulus. Calculate $\chi^2$, the sum of the quantities

$$\frac{w_k\,(p_k - p_k')^2}{p_k'\,q_k'}.$$

Then in Table XII of Pearson's Tables*, in the column $n' = n + 1$ and the row $\chi^2$ (interpolate if necessary) find the value of P, the probability of obtaining the observed p's or a worse set, from the p''s by random sampling.

The probable error of $\chi^2$ is here approximately $1.35\,\chi$ and the probable error of P is approximately

$$.6745\,(P_{n+1} - P_{n-1})\,\chi,$$

where $P_{n+1}$ is P itself, and $P_{n-1}$ is the value found in Table XII by using the (n − 1)th instead of the (n + 1)th column.


APPENDIX.

What value will be obtained for $\chi^2$ if, in the example used above (normal integral hypothesis), we were to proceed by first forming a histogram and then treating this histogram as though it were an ordinary directly observed one, i.e. using equation (15) above? The cells of the histogram will be occupied by the quantities m = δp × w (observation) or m' = δp' × w (theory) where δp is the change in p from one stimulus to the next and w the number of observations at each stimulus, here the same throughout.

     p        δp        p'        δp'     ε = δp − δp'      ε²          ε²/δp'
             .0233               .0088       .0145       .00021025      .0239
   .0233               .0088
             .0034               .0350       .0316       .00099856      .0285
   .0267               .0438
             .0900               .1051       .0151       .00022801      .0022
   .1167               .1489
             .2400               .2055       .0345       .00119025      .0058
   .3567               .3544
             .2533               .2611       .0078       .00006084      .0002
   .6100               .6155
             .2733               .2164       .0569       .00323741      .0150
   .8833               .8319
             .0467               .1164       .0697       .00485809      .0418
   .9300               .9483
             .0700               .0517       .0183       .00033489      .0065
                                                                        .1239 = S(ε²/δp')

whence $\chi^2 = 300 \times .1239 = 37.17$, instead of the proper value 19.56. If the calculation is performed in this inaccurate way, therefore (by analogy with data which are really in histogram form), a very wrong idea of the closeness of fit would be obtained. The reason, as stated above, is that the correlations between the cells of the histogram derived from an ogive with independently measured p's are not such as to lead to equation (15).
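
The histogram calculation of this Appendix may be sketched as follows, purely as an illustration. The helper name `cell_contents` is hypothetical; it differences each ogive between the limits 0 and 1 to obtain the eight cell contents δp and δp'.

```python
observed = [0.0233, 0.0267, 0.1167, 0.3567, 0.6100, 0.8833, 0.9300]
theory = [0.0088, 0.0438, 0.1489, 0.3544, 0.6155, 0.8319, 0.9483]
trials = 300  # observations at each stimulus (w in the text)

def cell_contents(ogive):
    """Differences of successive ogive values, with 0 and 1 as outer limits."""
    bounds = [0.0] + list(ogive) + [1.0]
    return [upper - lower for lower, upper in zip(bounds, bounds[1:])]

dp_observed = cell_contents(observed)
dp_theory = cell_contents(theory)

# Treating the cells as a directly observed histogram (the inaccurate method).
chi2_histogram = trials * sum((o - t) ** 2 / t
                              for o, t in zip(dp_observed, dp_theory))
print(round(chi2_histogram, 2))
```

The result is close to the 37.17 of the table, almost double the value found by the proper method.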


* Tables for Statisticians and Biometricians, Cambridge University Press, 1914. 


ON CORRECTIONS FOR THE MOMENT-COEFFICIENTS OF 
LIMITED RANGE FREQUENCY DISTRIBUTIONS WHEN 
THERE ARE FINITE OR INFINITE ORDINATES AND 
ANY SLOPES AT THE TERMINALS OF THE RANGE. 


By ELEANOR PAIRMAN AND KARL PEARSON, F.R.S.


Part I. Non-Asymptotic Curves. 


(1) We have in recent practice found the importance of full corrections for the moment-coefficients in the case of singly and doubly curtailed blocks of frequency such as are indicated in the accompanying figure. It has not been adequately recognised that even the mean of such distributions is not correctly obtained by grouping at the midpoints of the subranges h, and merely finding the mean of these concentrated groups. Still less is this a correct process in the case of the higher moment-coefficients. The practical statisticians, aware possibly of the existence of "Sheppard's corrections," have been warned that they are only exact for the case of high contact, and regarding this have in their doubt neglected all corrections whatever. Now Sheppard's corrections are still valid when there is no high contact, and they should therefore always be used, but they form only part of the full correction* and may indeed merely amount to some 50 % of its value, although 75 % is a more usual average proportion, if the frequency block does not end in finite ordinates. We propose in the first part of this paper to deal with frequency blocks such as are indicated in the figure above, reserving for the second part the treatment of the corrections needful when the frequency curve asymptotes to the frequency axis, i.e. the cases of J- and U-shaped frequency distributions. The general treatment of non-asymptotic frequency blocks will follow the lines of pp. 282-8 of the paper: "On the systematic Fitting of Curves," contributed in 1902 by one of the present writers to the first volume of this Journal.

* In certain cases although part of the full correction they are in the wrong sense, and therefore if used alone would be worse than the raw moments.


(2) The method there adopted started from the Euler-Maclaurin formula:

$$\int_{x_0}^{x_p} Z'\,dx = h\left\{\tfrac12 Z_0' + Z_1' + Z_2' + \ldots + Z_{p-1}' + \tfrac12 Z_p'\right\} - \left[\frac{h^2}{12}\frac{dZ'}{dx} - \frac{h^4}{720}\frac{d^3Z'}{dx^3} + \frac{h^6}{30240}\frac{d^5Z'}{dx^5} - \frac{h^8}{1209600}\frac{d^7Z'}{dx^7} + \frac{h^{10}}{47900160}\frac{d^9Z'}{dx^9} - \frac{B_6 h^{12}}{12!}\frac{d^{11}Z'}{dx^{11}} + \ldots\right]_{x_0}^{x_p} \quad\ldots\text{(i)},$$

where Z' is any function of x and $Z_0', Z_1', Z_2', \ldots Z_p'$ are the p + 1 values of this function corresponding to p subranges taken from $x = x_0$ to $x = x_p$ of the range l. Clearly $ph = l = x_p - x_0$. $B_6, B_7 \ldots$ are the higher Bernoulli numbers. The first term on the right involving the p + 1 values of the function Z' is the "chordal area"; the term between square brackets depends on the values of certain differential coefficients at the ends of the range, and these again depend on the form we assume for the frequency curve in the neighbourhood of the terminals. The value we are going to take for Z' is $x^sZ$, where Z is the integral $\int_x^{x_p} y\,dx$, or, y being the frequency ordinate, Z is the total frequency on the section $x_p - x$ of the range. In evaluating the limits we need not proceed beyond the ninth differential, for the 11th vanishes for s = 5 with our assumptions for Z, and in our experience of actual frequency the ninth term as a rule contributes very little to the total correction. In order to obtain our results we must assume some form for Z at the terminals of the frequency. At $x = x_0$, $Z = N$, and at $x = x_p$, $Z = 0$. We shall assume Z given by high order parabolae in the neighbourhood of the terminals, i.e.

$$Z = N\left\{1 + a_1\frac{(x-x_0)}{h_0} + \frac{a_2}{2!}\frac{(x-x_0)^2}{h_0^2} + \frac{a_3}{3!}\frac{(x-x_0)^3}{h_0^3} + \frac{a_4}{4!}\frac{(x-x_0)^4}{h_0^4} + \frac{a_5}{5!}\frac{(x-x_0)^5}{h_0^5}\right\}$$

in the neighbourhood of $x = x_0$, and

$$Z = N\left\{b_1\frac{(x_p-x)}{h_p} + \frac{b_2}{2!}\frac{(x_p-x)^2}{h_p^2} + \frac{b_3}{3!}\frac{(x_p-x)^3}{h_p^3} + \frac{b_4}{4!}\frac{(x_p-x)^4}{h_p^4} + \frac{b_5}{5!}\frac{(x_p-x)^5}{h_p^5}\right\}$$

in the neighbourhood of $x = x_p$.
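
The Euler-Maclaurin formula (i) can be checked numerically on a case where the series terminates. The sketch below is an illustration only: the function name `euler_maclaurin` is hypothetical, the integrand $x^4$ on the range 0 to 1 is an arbitrary choice, and only the first two limit terms are coded because the fifth and higher differentials of $x^4$ vanish.

```python
def euler_maclaurin(f, d1, d3, x0, xp, p):
    """Return (chordal area, chordal area minus the first two limit terms)."""
    h = (xp - x0) / p
    values = [f(x0 + u * h) for u in range(p + 1)]
    chordal = h * (0.5 * values[0] + sum(values[1:-1]) + 0.5 * values[-1])
    limit = ((h ** 2 / 12.0) * (d1(xp) - d1(x0))
             - (h ** 4 / 720.0) * (d3(xp) - d3(x0)))
    return chordal, chordal - limit

chordal, corrected = euler_maclaurin(
    f=lambda x: x ** 4,
    d1=lambda x: 4.0 * x ** 3,   # first differential coefficient
    d3=lambda x: 24.0 * x,       # third differential coefficient
    x0=0.0, xp=1.0, p=4,
)
print(chordal, corrected)   # the exact integral is 1/5
```

With four subranges the chordal area alone is about 10 % too large, while the corrected value agrees with the integral to rounding error, which is the behaviour the limit terms are meant to secure.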


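These lead at once to

$$\left(\frac{d^qZ}{dx^q}\right)_{x_0} = \frac{N a_q}{h_0^q} \quad\text{and}\quad \left(\frac{d^qZ}{dx^q}\right)_{x_p} = \frac{N(-1)^q\,b_q}{h_p^q} \quad\ldots\text{(ii)}.$$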

Exactly as in the earlier memoir we shall determine the a's and b's from five frequencies adjacent to the terminals of the range. In many cases, however, e.g. deaths in infancy, disease incidence in infancy, wages, incomes or house-valuations, we have details for the ends of the range on different subranges to those for the bulk of the curve. These modified subranges will be termed $h_0$ and $h_p$, and when either of them is less than h, we shall get more accurate corrective terms if they are used instead of the frequencies on subranges h. At the same time it must be remembered that in calculating the value of the chordal area terms, sufficient of these $h_0$ and $h_p$ subrange frequencies must be clubbed together to give sub-frequencies on ranges h.


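Let the frequencies on the first five subranges $h_0$ or h from $x = x_0$ be $Nn_1', Nn_2', Nn_3', Nn_4', Nn_5'$, so that $n_1', n_2', n_3', n_4', n_5'$ are proportional frequencies, then

$$N(1 - n_1') = N\left(1 + a_1 + \frac{a_2}{2!} + \frac{a_3}{3!} + \frac{a_4}{4!} + \frac{a_5}{5!}\right),$$

or

$$-n_1' = a_1 + \frac{a_2}{2!} + \frac{a_3}{3!} + \frac{a_4}{4!} + \frac{a_5}{5!}.$$

Similarly

$$-n_1' - n_2' = 2a_1 + \frac{2^2}{2!}a_2 + \frac{2^3}{3!}a_3 + \frac{2^4}{4!}a_4 + \frac{2^5}{5!}a_5,$$
$$-n_1' - n_2' - n_3' = 3a_1 + \frac{3^2}{2!}a_2 + \frac{3^3}{3!}a_3 + \frac{3^4}{4!}a_4 + \frac{3^5}{5!}a_5,$$
$$-n_1' - n_2' - n_3' - n_4' = 4a_1 + \frac{4^2}{2!}a_2 + \frac{4^3}{3!}a_3 + \frac{4^4}{4!}a_4 + \frac{4^5}{5!}a_5,$$
$$-n_1' - n_2' - n_3' - n_4' - n_5' = 5a_1 + \frac{5^2}{2!}a_2 + \frac{5^3}{3!}a_3 + \frac{5^4}{4!}a_4 + \frac{5^5}{5!}a_5.$$

Solving these equations we find

$$a_1 = -\tfrac{1}{60}\{137n_1' - 163n_2' + 137n_3' - 63n_4' + 12n_5'\},$$
$$a_2 = \tfrac{1}{12}\{45n_1' - 109n_2' + 105n_3' - 51n_4' + 10n_5'\},$$
$$a_3 = -\tfrac14\{17n_1' - 54n_2' + 64n_3' - 34n_4' + 7n_5'\}, \quad\ldots\text{(iii)}$$
$$a_4 = \{3n_1' - 11n_2' + 15n_3' - 9n_4' + 2n_5'\},$$
$$a_5 = -\{n_1' - 4n_2' + 6n_3' - 4n_4' + n_5'\}.$$

Similarly we find for the b coefficients

$$b_1 = +\tfrac{1}{60}\{137n_p' - 163n_{p-1}' + 137n_{p-2}' - 63n_{p-3}' + 12n_{p-4}'\},$$
$$b_2 = -\tfrac{1}{12}\{45n_p' - 109n_{p-1}' + 105n_{p-2}' - 51n_{p-3}' + 10n_{p-4}'\},$$
$$b_3 = +\tfrac14\{17n_p' - 54n_{p-1}' + 64n_{p-2}' - 34n_{p-3}' + 7n_{p-4}'\}, \quad\ldots\text{(iv)}$$
$$b_4 = -\{3n_p' - 11n_{p-1}' + 15n_{p-2}' - 9n_{p-3}' + 2n_{p-4}'\},$$
$$b_5 = +\{n_p' - 4n_{p-1}' + 6n_{p-2}' - 4n_{p-3}' + n_{p-4}'\},$$

where $Nn_{p-4}', Nn_{p-3}', Nn_{p-2}', Nn_{p-1}', Nn_p'$ are the five successive frequencies adjacent to the terminal $x = x_p$ of the subranges $h_p$ or h as the case may be.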
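
Equations (iii) and (iv) lend themselves to a direct sketch. The function names below are hypothetical; the frequencies used for the trial are the proportional frequencies of the parabola $y = \sqrt{x}$ on ten unit subranges, as tabulated in Illustration I of this paper, so the output can be compared with the a's and b's printed there.

```python
def terminal_sums(n):
    """The five bracketed sums of (iii)/(iv); n lists the five proportional
    frequencies starting with the one adjacent to the terminal."""
    n1, n2, n3, n4, n5 = n
    return (
        (137 * n1 - 163 * n2 + 137 * n3 - 63 * n4 + 12 * n5) / 60.0,
        (45 * n1 - 109 * n2 + 105 * n3 - 51 * n4 + 10 * n5) / 12.0,
        (17 * n1 - 54 * n2 + 64 * n3 - 34 * n4 + 7 * n5) / 4.0,
        3 * n1 - 11 * n2 + 15 * n3 - 9 * n4 + 2 * n5,
        n1 - 4 * n2 + 6 * n3 - 4 * n4 + n5,
    )

def a_coefficients(first_five):
    """a_1 ... a_5 of equations (iii); signs alternate starting with minus."""
    c = terminal_sums(first_five)
    return (-c[0], c[1], -c[2], c[3], -c[4])

def b_coefficients(last_five_from_terminal):
    """b_1 ... b_5 of equations (iv); signs alternate starting with plus."""
    c = terminal_sums(last_five_from_terminal)
    return (c[0], -c[1], c[2], -c[3], c[4])

# Proportional frequencies of Illustration I (first five, and last five
# read backwards from the terminal x_p).
a = a_coefficients([0.031623, 0.057820, 0.074874, 0.088665, 0.100571])
b = b_coefficients([0.146185, 0.138273, 0.129880, 0.120904, 0.111205])
print([round(v, 7) for v in a])
print([round(v, 7) for v in b])
```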


Since $dZ/dx = -y$, it follows that

$$y_0 = -\frac{Na_1}{h_0} = \frac{N}{60h_0}\{137n_1' - 163n_2' + 137n_3' - 63n_4' + 12n_5'\},$$
$$y_p = \frac{Nb_1}{h_p} = \frac{N}{60h_p}\{137n_p' - 163n_{p-1}' + 137n_{p-2}' - 63n_{p-3}' + 12n_{p-4}'\} \quad\ldots\text{(v)}.$$

These results enable us to determine approximately the terminal ordinates of the frequency distributions given by sub-frequencies, and to discover how nearly the frequency curve comes to zero at the terminals of the range. Similarly the smallness of the quantities $a_2, a_3, a_4, a_5$ and $b_2, b_3, b_4, b_5$ marks the character of the terminal contact. At the same time the reader must remember two points (i) that the terminal frequencies if small may be subject to large probable errors and (ii) that we have supposed y = 0, when $x = x_0$ and $x = x_p$, the terminals of an integer number of subranges. It is extremely unlikely that the frequency curve would cut the variate axis exactly at such places. Hence on both counts, (i) and (ii), we must not anticipate in actual practice that $a_1$ and $b_1$ will vanish at $x = x_0$ and $x = x_p$ for non-abruptly terminating frequency, unless we know a priori the terminals of the range and have chosen our subranges to fit this knowledge.


(3) The next stage in our work must be to table the values of

$$\frac{dZ'}{dx},\ \frac{d^3Z'}{dx^3},\ \ldots\ \frac{d^9Z'}{dx^9},$$

where $Z' = Zx^s$ at the two terminals of the range. We may do this for s = 0, 1, 2, 3, 4, 5. The theorem of Leibnitz provides the needful expansions which are

$$\frac{dZ'}{dx} = x^s\frac{dZ}{dx} + sx^{s-1}Z,$$

$$\frac{d^3Z'}{dx^3} = x^s\frac{d^3Z}{dx^3} + 3sx^{s-1}\frac{d^2Z}{dx^2} + 3s(s-1)x^{s-2}\frac{dZ}{dx} + s(s-1)(s-2)x^{s-3}Z,$$

$$\frac{d^5Z'}{dx^5} = x^s\frac{d^5Z}{dx^5} + 5sx^{s-1}\frac{d^4Z}{dx^4} + 10s(s-1)x^{s-2}\frac{d^3Z}{dx^3} + 10s(s-1)(s-2)x^{s-3}\frac{d^2Z}{dx^2} + 5s(s-1)(s-2)(s-3)x^{s-4}\frac{dZ}{dx} + s(s-1)(s-2)(s-3)(s-4)x^{s-5}Z,$$

$$\frac{d^7Z'}{dx^7} = x^s\frac{d^7Z}{dx^7} + 7sx^{s-1}\frac{d^6Z}{dx^6} + 21s(s-1)x^{s-2}\frac{d^5Z}{dx^5} + 35s(s-1)(s-2)x^{s-3}\frac{d^4Z}{dx^4} + 35s(s-1)(s-2)(s-3)x^{s-4}\frac{d^3Z}{dx^3} + 21s(s-1)(s-2)(s-3)(s-4)x^{s-5}\frac{d^2Z}{dx^2} + 7s(s-1)(s-2)(s-3)(s-4)(s-5)x^{s-6}\frac{dZ}{dx} + s(s-1)(s-2)(s-3)(s-4)(s-5)(s-6)x^{s-7}Z,$$

$$\frac{d^9Z'}{dx^9} = x^s\frac{d^9Z}{dx^9} + 9sx^{s-1}\frac{d^8Z}{dx^8} + 36s(s-1)x^{s-2}\frac{d^7Z}{dx^7} + 84s(s-1)(s-2)x^{s-3}\frac{d^6Z}{dx^6} + 126s(s-1)(s-2)(s-3)x^{s-4}\frac{d^5Z}{dx^5} + 126s(s-1)(s-2)(s-3)(s-4)x^{s-5}\frac{d^4Z}{dx^4} + 84s(s-1)(s-2)(s-3)(s-4)(s-5)x^{s-6}\frac{d^3Z}{dx^3} + \ldots \quad\ldots\text{(vi)}.$$

Now all higher differentials of Z than the fifth vanish, and therefore we may cancel the first two terms of $d^7Z'/dx^7$ and the first four of $d^9Z'/dx^9$. The value of $d^{11}Z'/dx^{11}$ starts with the term $462\,s(s-1)(s-2)(s-3)(s-4)(s-5)\,x^{s-6}\,d^5Z/dx^5$, and accordingly this and all terms beyond vanish for s = 0, 1, ... 5, or this differential of Z' is zero for our purposes.


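We have now to give s in succession the values 0 to 5, and subtract the results for the first from those for the second terminal:

s = 0:

$$\left[\frac{dZ'}{dx}\right]_{x_0}^{x_p} = \left(-\frac{b_1}{h_p} - \frac{a_1}{h_0}\right)N,$$
$$\left[\frac{d^3Z'}{dx^3}\right]_{x_0}^{x_p} = \left(-\frac{b_3}{h_p^3} - \frac{a_3}{h_0^3}\right)N,$$
$$\left[\frac{d^5Z'}{dx^5}\right]_{x_0}^{x_p} = \left(-\frac{b_5}{h_p^5} - \frac{a_5}{h_0^5}\right)N, \qquad \left[\frac{d^{2q+1}Z'}{dx^{2q+1}}\right]_{x_0}^{x_p} = 0 \text{ if } q > 2.$$

s = 1:

$$\left[\frac{dZ'}{dx}\right]_{x_0}^{x_p} = \left(-\left(b_1\frac{x_p}{h_p} + a_1\frac{x_0}{h_0}\right) - 1\right)N,$$
$$\left[\frac{d^3Z'}{dx^3}\right]_{x_0}^{x_p} = \left(-\left(b_3\frac{x_p}{h_p^3} + a_3\frac{x_0}{h_0^3}\right) + 3\left(\frac{b_2}{h_p^2} - \frac{a_2}{h_0^2}\right)\right)N,$$
$$\left[\frac{d^5Z'}{dx^5}\right]_{x_0}^{x_p} = \left(-\left(b_5\frac{x_p}{h_p^5} + a_5\frac{x_0}{h_0^5}\right) + 5\left(\frac{b_4}{h_p^4} - \frac{a_4}{h_0^4}\right)\right)N, \qquad \left[\frac{d^{2q+1}Z'}{dx^{2q+1}}\right]_{x_0}^{x_p} = 0 \text{ if } q > 2.$$

s = 2:

$$\left[\frac{dZ'}{dx}\right]_{x_0}^{x_p} = \left(-\left(b_1\frac{x_p^2}{h_p} + a_1\frac{x_0^2}{h_0}\right) - 2x_0\right)N,$$
$$\left[\frac{d^3Z'}{dx^3}\right]_{x_0}^{x_p} = \left(-\left(b_3\frac{x_p^2}{h_p^3} + a_3\frac{x_0^2}{h_0^3}\right) + 6\left(b_2\frac{x_p}{h_p^2} - a_2\frac{x_0}{h_0^2}\right) - 6\left(\frac{b_1}{h_p} + \frac{a_1}{h_0}\right)\right)N,$$
$$\left[\frac{d^5Z'}{dx^5}\right]_{x_0}^{x_p} = \left(-\left(b_5\frac{x_p^2}{h_p^5} + a_5\frac{x_0^2}{h_0^5}\right) + 10\left(b_4\frac{x_p}{h_p^4} - a_4\frac{x_0}{h_0^4}\right) - 20\left(\frac{b_3}{h_p^3} + \frac{a_3}{h_0^3}\right)\right)N,$$
$$\left[\frac{d^7Z'}{dx^7}\right]_{x_0}^{x_p} = -42\left(\frac{b_5}{h_p^5} + \frac{a_5}{h_0^5}\right)N, \qquad \left[\frac{d^{2q+1}Z'}{dx^{2q+1}}\right]_{x_0}^{x_p} = 0 \text{ if } q > 3.$$

s = 3:

$$\left[\frac{dZ'}{dx}\right]_{x_0}^{x_p} = \left(-\left(b_1\frac{x_p^3}{h_p} + a_1\frac{x_0^3}{h_0}\right) - 3x_0^2\right)N,$$
$$\left[\frac{d^3Z'}{dx^3}\right]_{x_0}^{x_p} = \left(-\left(b_3\frac{x_p^3}{h_p^3} + a_3\frac{x_0^3}{h_0^3}\right) + 9\left(b_2\frac{x_p^2}{h_p^2} - a_2\frac{x_0^2}{h_0^2}\right) - 18\left(b_1\frac{x_p}{h_p} + a_1\frac{x_0}{h_0}\right) - 6\right)N,$$
$$\left[\frac{d^5Z'}{dx^5}\right]_{x_0}^{x_p} = \left(-\left(b_5\frac{x_p^3}{h_p^5} + a_5\frac{x_0^3}{h_0^5}\right) + 15\left(b_4\frac{x_p^2}{h_p^4} - a_4\frac{x_0^2}{h_0^4}\right) - 60\left(b_3\frac{x_p}{h_p^3} + a_3\frac{x_0}{h_0^3}\right) + 60\left(\frac{b_2}{h_p^2} - \frac{a_2}{h_0^2}\right)\right)N,$$
$$\left[\frac{d^7Z'}{dx^7}\right]_{x_0}^{x_p} = \left(-126\left(b_5\frac{x_p}{h_p^5} + a_5\frac{x_0}{h_0^5}\right) + 210\left(\frac{b_4}{h_p^4} - \frac{a_4}{h_0^4}\right)\right)N, \qquad \left[\frac{d^{2q+1}Z'}{dx^{2q+1}}\right]_{x_0}^{x_p} = 0 \text{ if } q > 3.$$

s = 4:

$$\left[\frac{dZ'}{dx}\right]_{x_0}^{x_p} = \left(-b_1\frac{x_p^4}{h_p} - a_1\frac{x_0^4}{h_0} - 4x_0^3\right)N,$$
$$\left[\frac{d^3Z'}{dx^3}\right]_{x_0}^{x_p} = \left(-\left(b_3\frac{x_p^4}{h_p^3} + a_3\frac{x_0^4}{h_0^3}\right) + 12\left(b_2\frac{x_p^3}{h_p^2} - a_2\frac{x_0^3}{h_0^2}\right) - 36\left(b_1\frac{x_p^2}{h_p} + a_1\frac{x_0^2}{h_0}\right) - 24x_0\right)N,$$
$$\left[\frac{d^5Z'}{dx^5}\right]_{x_0}^{x_p} = \left(-\left(b_5\frac{x_p^4}{h_p^5} + a_5\frac{x_0^4}{h_0^5}\right) + 20\left(b_4\frac{x_p^3}{h_p^4} - a_4\frac{x_0^3}{h_0^4}\right) - 120\left(b_3\frac{x_p^2}{h_p^3} + a_3\frac{x_0^2}{h_0^3}\right) + 240\left(b_2\frac{x_p}{h_p^2} - a_2\frac{x_0}{h_0^2}\right) - 120\left(\frac{b_1}{h_p} + \frac{a_1}{h_0}\right)\right)N,$$
$$\left[\frac{d^7Z'}{dx^7}\right]_{x_0}^{x_p} = \left(-252\left(b_5\frac{x_p^2}{h_p^5} + a_5\frac{x_0^2}{h_0^5}\right) + 840\left(b_4\frac{x_p}{h_p^4} - a_4\frac{x_0}{h_0^4}\right) - 840\left(\frac{b_3}{h_p^3} + \frac{a_3}{h_0^3}\right)\right)N,$$
$$\left[\frac{d^9Z'}{dx^9}\right]_{x_0}^{x_p} = -3024\left(\frac{b_5}{h_p^5} + \frac{a_5}{h_0^5}\right)N, \qquad \left[\frac{d^{2q+1}Z'}{dx^{2q+1}}\right]_{x_0}^{x_p} = 0,\ q > 4.$$

s = 5:

$$\left[\frac{dZ'}{dx}\right]_{x_0}^{x_p} = \left(-b_1\frac{x_p^5}{h_p} - a_1\frac{x_0^5}{h_0} - 5x_0^4\right)N,$$
$$\left[\frac{d^3Z'}{dx^3}\right]_{x_0}^{x_p} = \left(-\left(b_3\frac{x_p^5}{h_p^3} + a_3\frac{x_0^5}{h_0^3}\right) + 15\left(b_2\frac{x_p^4}{h_p^2} - a_2\frac{x_0^4}{h_0^2}\right) - 60\left(b_1\frac{x_p^3}{h_p} + a_1\frac{x_0^3}{h_0}\right) - 60x_0^2\right)N,$$
$$\left[\frac{d^5Z'}{dx^5}\right]_{x_0}^{x_p} = \left(-\left(b_5\frac{x_p^5}{h_p^5} + a_5\frac{x_0^5}{h_0^5}\right) + 25\left(b_4\frac{x_p^4}{h_p^4} - a_4\frac{x_0^4}{h_0^4}\right) - 200\left(b_3\frac{x_p^3}{h_p^3} + a_3\frac{x_0^3}{h_0^3}\right) + 600\left(b_2\frac{x_p^2}{h_p^2} - a_2\frac{x_0^2}{h_0^2}\right) - 600\left(b_1\frac{x_p}{h_p} + a_1\frac{x_0}{h_0}\right) - 120\right)N,$$
$$\left[\frac{d^7Z'}{dx^7}\right]_{x_0}^{x_p} = \left(-420\left(b_5\frac{x_p^3}{h_p^5} + a_5\frac{x_0^3}{h_0^5}\right) + 2100\left(b_4\frac{x_p^2}{h_p^4} - a_4\frac{x_0^2}{h_0^4}\right) - 4200\left(b_3\frac{x_p}{h_p^3} + a_3\frac{x_0}{h_0^3}\right) + 2520\left(\frac{b_2}{h_p^2} - \frac{a_2}{h_0^2}\right)\right)N,$$
$$\left[\frac{d^9Z'}{dx^9}\right]_{x_0}^{x_p} = \left(-15120\left(b_5\frac{x_p}{h_p^5} + a_5\frac{x_0}{h_0^5}\right) + 15120\left(\frac{b_4}{h_p^4} - \frac{a_4}{h_0^4}\right)\right)N, \qquad \left[\frac{d^{2q+1}Z'}{dx^{2q+1}}\right]_{x_0}^{x_p} = 0,\ q > 4 \quad\ldots\text{(vii)}.$$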


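(4) We have now to see the relation of the present integral $\int_{x_0}^{x_p} Zx^s\,dx$ to the moment-coefficients. Integrating by parts we have

$$\int_{x_0}^{x_p} Zx^s\,dx = \left[\frac{Zx^{s+1}}{s+1}\right]_{x_0}^{x_p} + \frac{1}{s+1}\int_{x_0}^{x_p} yx^{s+1}\,dx = -\frac{Nx_0^{s+1}}{s+1} + \frac{N\mu_{s+1}'}{s+1} \quad\ldots\text{(viii)},$$

where $\mu_{s+1}'$ is the (s + 1)th moment-coefficient about the arbitrary origin. Accordingly we have, changing s to s − 1,

$$\mu_s' = x_0^s + \frac{s}{N}\int_{x_0}^{x_p} Zx^{s-1}\,dx \quad\ldots\text{(ix)}.$$

Thus we can write

$$\mu_s' = x_0^s + \frac{s}{N}C_{s-1} - \frac{s}{N}T_{s-1},$$

where $C_{s-1}$ is the "chordal area" term and $T_{s-1}$ is the limit term of the Euler-Maclaurin series, or if $x_u = x_0 + uh$,

$$C_{s-1} = h\left\{\tfrac12 Z_0x_0^{s-1} + Z_1x_1^{s-1} + \ldots + Z_{p-1}x_{p-1}^{s-1} + \tfrac12 Z_px_p^{s-1}\right\},$$

$$T_{s-1} = \left[\frac{h^2}{12}\frac{d(Zx^{s-1})}{dx} - \frac{h^4}{720}\frac{d^3(Zx^{s-1})}{dx^3} + \frac{h^6}{30240}\frac{d^5(Zx^{s-1})}{dx^5} - \frac{h^8}{1209600}\frac{d^7(Zx^{s-1})}{dx^7} + \frac{h^{10}}{47900160}\frac{d^9(Zx^{s-1})}{dx^9}\right]_{x_0}^{x_p}.$$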

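We now turn to the evaluation of the chordal areas. We can obtain these by remembering that if $n_q$ be the frequency on the qth subrange h,

$$Z_q = \int_{x_q}^{x_p} y\,dx = n_{q+1} + n_{q+2} + \ldots + n_p.$$

Thus it follows that

$$C_{s-1} = h\sum_{u=1}^{p}\left\{\tfrac12 x_0^{s-1} + (x_0 + h)^{s-1} + (x_0 + 2h)^{s-1} + \ldots + (x_0 + (u-1)h)^{s-1}\right\}n_u,$$

if we note that $Z_p = 0$.

But the series coefficient of $n_u$ can itself be summed by the Euler-Maclaurin Theorem, i.e.

$$h\left\{\tfrac12 x_0^{s-1} + (x_0 + h)^{s-1} + \ldots + (x_0 + (u-1)h)^{s-1}\right\}$$
$$= \int_{x_0}^{x_0+uh} x^{s-1}\,dx - \tfrac12 h(x_0 + uh)^{s-1} + \left[\frac{h^2}{12}\frac{d(x^{s-1})}{dx} - \frac{h^4}{720}\frac{d^3(x^{s-1})}{dx^3} + \frac{h^6}{30240}\frac{d^5(x^{s-1})}{dx^5} - \ldots\right]_{x_0}^{x_0+uh}$$
$$= -\tfrac12 h(x_0 + uh)^{s-1} + \frac1s(x_0 + uh)^s + (s-1)\tfrac{1}{12}h^2(x_0 + uh)^{s-2}$$
$$\quad - (s-1)(s-2)(s-3)\tfrac{1}{720}h^4(x_0 + uh)^{s-4}$$
$$\quad + (s-1)(s-2)(s-3)(s-4)(s-5)\tfrac{1}{30240}h^6(x_0 + uh)^{s-6} - \ldots$$
$$\quad - \frac1s x_0^s - (s-1)\tfrac{1}{12}h^2x_0^{s-2} + (s-1)(s-2)(s-3)\tfrac{1}{720}h^4x_0^{s-4}$$
$$\quad - (s-1)(s-2)(s-3)(s-4)(s-5)\tfrac{1}{30240}h^6x_0^{s-6} + \ldots$$

We are now in the position to find the value of $sC_{s-1}$ for the successive values of s. We have

$$\frac1N(sC_{s-1})_{s=1} = \frac1N\sum_{u=1}^{p}\left\{-\tfrac12 h + x_0 + uh - x_0\right\}n_u = \frac1N\sum_{u=1}^{p}\left\{x_0 + (u - \tfrac12)h\right\}n_u - x_0 = \nu_1' - x_0 \quad\ldots\text{(x)},$$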
=1 u=1 
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where v,/= moment about the origin of the subrange frequencies concentrated at 
their mid subrange points. Similarly 


= OO.) = ws, {— h(a + uh) + (a + uh)? + Eh? = vP — EM} ry 
1% iL Lye 2 
=a a \(ay + (w — 4) hy — th? — a7} ry 
Sy LIA HS ids eats ate Sree we gk ens eee Gal): 


» 8h, 2 h2 
= AG ie) ea * S ie BE (a) + uh)? + (ay +uhy + i (a) + wh) — x3 — 5 a Nu 


u=1 { 


1 
=o N {(@o + (u—4) hy — 4h? (ay + (u— 4)h) — a2 — kh?.a} ru 


w=. 


SVs She = PIP ay SOG cx teeocsanadecaget sbaeodes pense (xi)s 


ay (60 nar)on -;4 S {Qh (a, + uh) + (ay + uh)! + h? (ay + why 


u=1 


dght — at — h?ae + ayh'} nw 


1 
= 8 N pCzas — Ah) — 4 (ay t+ (u— 4d) hp t aghast — a7} re 
=p — ; Dbvs, + ES = crt — Pat ceases abesteneeneu feet eeeie (xii1); 
1 1 1 De r 5 5 Le 3 
(SC, ene WV S {— Sh (a + uh) + (a + uhy + Bh? (ae + uhy 
a u=1 
— tht (a + uh) — 2° — BP? + th4 xo} Ny 
il 
= ee 8 jee —a4)hy— Sh? (a + (u—4)h)y 
+ Joh (ay + (u— 4) h) — a8 — Sa? + Eh} rw 
= vs — BP yg + Ful?! — 08 — BRE A EM ay 2. reese cece evens (xiv); 


1 1 2 een 
NV (sCs1)s—6 = Y, oi {— Bh (ay + uh) + (a + uh) + 3h? (a + uh) 
Eh! (ay + why + gh’ — 08 — gla + ghia? — ah} nw 


aE 224 
=— § {(a+(u—4)h)— Sh? (a, + (u—$)h)! 
N u=1 
+ ht (ay + (u — 4) hy — Bhs — ao — Shira + hha? rm 
=, — Shy + dehy! — @hi — 48 — RR at t+ ghia? veers ee (xv). 
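
The relations (x) to (xiii) between the chordal areas and the raw moments $\nu'$ can be verified numerically for any set of frequencies. In the sketch below the function names and the trial frequencies, origin and subrange are arbitrary assumptions made for illustration; `chordal_area` follows the definition of $C_{s-1}$ term by term, and `raw_moment` concentrates each frequency at its mid-subrange point.

```python
def chordal_area(freq, x0, h, power):
    """C = h{1/2 Z_0 x_0^power + Z_1 x_1^power + ... + 1/2 Z_p x_p^power},
    where Z_q is the frequency lying beyond x_q."""
    p = len(freq)
    total = 0.0
    for q in range(p + 1):
        z_q = sum(freq[q:])                    # n_{q+1} + ... + n_p
        weight = 0.5 if q in (0, p) else 1.0
        total += weight * z_q * (x0 + q * h) ** power
    return h * total

def raw_moment(freq, x0, h, k):
    """nu'_k: k-th moment of the frequencies placed at mid-subrange points."""
    n_total = sum(freq)
    return sum(n * (x0 + (u + 0.5) * h) ** k
               for u, n in enumerate(freq)) / n_total

freq = [3.0, 1.0, 4.0, 1.0, 5.0]   # arbitrary trial frequencies
x0, h = 2.0, 0.5                   # arbitrary origin offset and subrange
N = sum(freq)
nu = [raw_moment(freq, x0, h, k) for k in range(5)]

lhs = [s * chordal_area(freq, x0, h, s - 1) / N for s in (1, 2, 3, 4)]
rhs = [
    nu[1] - x0,                                                   # (x)
    nu[2] - h ** 2 / 4 - x0 ** 2,                                 # (xi)
    nu[3] - h ** 2 * nu[1] / 4 - x0 ** 3 - h ** 2 * x0 / 2,       # (xii)
    nu[4] - h ** 2 * nu[2] / 2 + h ** 4 / 16
        - x0 ** 4 - h ** 2 * x0 ** 2,                             # (xiii)
]
print([round(l - r, 12) for l, r in zip(lhs, rhs)])
```

The differences printed are zero to rounding error, since the relations are algebraic identities and do not depend on the frequencies chosen.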


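(5) We can now put together the complete formulae for the corrected moment-coefficients about any origin from the values we have obtained for the component parts of (ix), but we may first simplify our notation slightly so as to abbreviate somewhat the lengthy resulting expressions. We write $\rho_p = h/h_p$, $\rho_0 = h/h_0$; these will very frequently be unity. Next we put $b_s' = b_s\rho_p^s$, $a_s' = a_s\rho_0^s$, and $x_p/h = x_p'$, $x_0/h = x_0'$; thus for terminal units the same as for the bulk of the frequency we should have $b_s' = b_s$, $a_s' = a_s$, and for working units where h generally = 1, $x_p' = x_p$, $x_0' = x_0$. We find

$$\mu_1' = \nu_1' + h\left\{\tfrac{1}{12}\left(a_1' - \tfrac{1}{60}a_3' + \tfrac{1}{2520}a_5'\right) + \tfrac{1}{12}\left(b_1' - \tfrac{1}{60}b_3' + \tfrac{1}{2520}b_5'\right)\right\} \quad\ldots\text{(xvi)}.$$

$$\mu_2' = \nu_2' - \tfrac{1}{12}h^2 + h^2\Big\{-\tfrac{1}{120}\left(a_2' - \tfrac{5}{126}a_4'\right) + \tfrac{1}{120}\left(b_2' - \tfrac{5}{126}b_4'\right) + \tfrac16x_0'\left(a_1' - \tfrac{1}{60}a_3' + \tfrac{1}{2520}a_5'\right) + \tfrac16x_p'\left(b_1' - \tfrac{1}{60}b_3' + \tfrac{1}{2520}b_5'\right)\Big\} \quad\ldots\text{(xvii)}.$$

$$\mu_3' = \nu_3' - \tfrac14h^2\nu_1' + h^3\Big\{-\tfrac{1}{40}\left(a_1' - \tfrac{5}{63}a_3' + \tfrac{1}{240}a_5'\right) - \tfrac{1}{40}\left(b_1' - \tfrac{5}{63}b_3' + \tfrac{1}{240}b_5'\right) - \tfrac{1}{40}x_0'\left(a_2' - \tfrac{5}{126}a_4'\right) + \tfrac{1}{40}x_p'\left(b_2' - \tfrac{5}{126}b_4'\right) + \tfrac14x_0'^2\left(a_1' - \tfrac{1}{60}a_3' + \tfrac{1}{2520}a_5'\right) + \tfrac14x_p'^2\left(b_1' - \tfrac{1}{60}b_3' + \tfrac{1}{2520}b_5'\right)\Big\} \quad\ldots\text{(xviii)}.$$

$$\mu_4' = \nu_4' - \tfrac12h^2\nu_2' + \tfrac{7}{240}h^4 + h^4\Big\{\tfrac{1}{126}\left(a_2' - \tfrac{7}{80}a_4'\right) - \tfrac{1}{126}\left(b_2' - \tfrac{7}{80}b_4'\right) - \tfrac{1}{10}x_0'\left(a_1' - \tfrac{5}{63}a_3' + \tfrac{1}{240}a_5'\right) - \tfrac{1}{10}x_p'\left(b_1' - \tfrac{5}{63}b_3' + \tfrac{1}{240}b_5'\right) - \tfrac{1}{20}x_0'^2\left(a_2' - \tfrac{5}{126}a_4'\right) + \tfrac{1}{20}x_p'^2\left(b_2' - \tfrac{5}{126}b_4'\right) + \tfrac13x_0'^3\left(a_1' - \tfrac{1}{60}a_3' + \tfrac{1}{2520}a_5'\right) + \tfrac13x_p'^3\left(b_1' - \tfrac{1}{60}b_3' + \tfrac{1}{2520}b_5'\right)\Big\} \quad\ldots\text{(xix)}.$$

$$\mu_5' = \nu_5' - \tfrac56h^2\nu_3' + \tfrac{7}{48}h^4\nu_1' + h^5\Big\{\tfrac{5}{252}\left(a_1' - \tfrac{7}{40}a_3' + \tfrac{7}{440}a_5'\right) + \tfrac{5}{252}\left(b_1' - \tfrac{7}{40}b_3' + \tfrac{7}{440}b_5'\right) + \tfrac{5}{126}x_0'\left(a_2' - \tfrac{7}{80}a_4'\right) - \tfrac{5}{126}x_p'\left(b_2' - \tfrac{7}{80}b_4'\right) - \tfrac14x_0'^2\left(a_1' - \tfrac{5}{63}a_3' + \tfrac{1}{240}a_5'\right) - \tfrac14x_p'^2\left(b_1' - \tfrac{5}{63}b_3' + \tfrac{1}{240}b_5'\right) - \tfrac{1}{12}x_0'^3\left(a_2' - \tfrac{5}{126}a_4'\right) + \tfrac{1}{12}x_p'^3\left(b_2' - \tfrac{5}{126}b_4'\right) + \tfrac{5}{12}x_0'^4\left(a_1' - \tfrac{1}{60}a_3' + \tfrac{1}{2520}a_5'\right) + \tfrac{5}{12}x_p'^4\left(b_1' - \tfrac{1}{60}b_3' + \tfrac{1}{2520}b_5'\right)\Big\} \quad\ldots\text{(xx)}.$$

$$\mu_6' = \nu_6' - \tfrac54h^2\nu_4' + \tfrac{7}{16}h^4\nu_2' - \tfrac{31}{1344}h^6 + h^6\Big\{-\tfrac{1}{80}\left(a_2' - \tfrac{5}{33}a_4'\right) + \tfrac{1}{80}\left(b_2' - \tfrac{5}{33}b_4'\right) + \tfrac{5}{42}x_0'\left(a_1' - \tfrac{7}{40}a_3' + \tfrac{7}{440}a_5'\right) + \tfrac{5}{42}x_p'\left(b_1' - \tfrac{7}{40}b_3' + \tfrac{7}{440}b_5'\right) + \tfrac{5}{42}x_0'^2\left(a_2' - \tfrac{7}{80}a_4'\right) - \tfrac{5}{42}x_p'^2\left(b_2' - \tfrac{7}{80}b_4'\right) - \tfrac12x_0'^3\left(a_1' - \tfrac{5}{63}a_3' + \tfrac{1}{240}a_5'\right) - \tfrac12x_p'^3\left(b_1' - \tfrac{5}{63}b_3' + \tfrac{1}{240}b_5'\right) - \tfrac18x_0'^4\left(a_2' - \tfrac{5}{126}a_4'\right) + \tfrac18x_p'^4\left(b_2' - \tfrac{5}{126}b_4'\right) + \tfrac12x_0'^5\left(a_1' - \tfrac{1}{60}a_3' + \tfrac{1}{2520}a_5'\right) + \tfrac12x_p'^5\left(b_1' - \tfrac{1}{60}b_3' + \tfrac{1}{2520}b_5'\right)\Big\} \quad\ldots\text{(xxi)}.$$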


The first series of terms outside the curled brackets are precisely the Sheppard's corrections for the moments which accordingly still remain essential portions of the corrective terms even when there are finite terminal ordinates and any degree of abruptness in the slopes at the end of the range. We may speak of the a's and b's as the "abruptness coefficients." They are determined by equations (iii) and (iv).

It will be clear that the terms in the curled brackets repeat themselves, so that in working as we usually do to the fourth moment we have to deal only with eight functions of the abruptness coefficients.

The next stage is to consider how equations (xvi)-(xxi) may be most advantageously arranged for practical statistical work. In all such work the subrange h is taken as unity. Hence we may always write it 1. Further, the origin is at our choice, and it might seem desirable to take it at the mean. But there are two means, namely the true mean $\mu_1'$ of the data and the mean $\nu_1'$ of the concentrated groups; with abruptness these are no longer identical. If we take moments about the true mean, $\nu_1'$ is not zero and our calculations are not simplified by its vanishing. On the other hand if we take moments about the mean of the concentrated groups we shall then have to transfer the $\mu$'s to their true mean. Nothing is therefore gained practically by the process. Besides this, neither mean is a good working origin, and if we take such to calculate the moments in the first place we may have two transfers to the means, one for the $\nu$'s and one for the $\mu$'s. Further $x_p'$ and $x_0'$ will both have to be calculated and all the terms used. We can, however, get rid of slightly less than half the corrective terms, if we take moments about one end of the range* and then transfer the $\mu$'s to their mean. This appears to us in practice to be the best policy, for, although it involves taking the differences of large numbers, it is quite easy with modern mechanical calculators to retain the requisite number of figures for accurate results. Accordingly we will rewrite our formulae with these changes, remembering that $x_p$ is now the range l = ph = p.


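$$\mu_1' = \nu_1' + \tfrac{1}{12}\left(a_1' - \tfrac{1}{60}a_3' + \tfrac{1}{2520}a_5'\right) + \tfrac{1}{12}\left(b_1' - \tfrac{1}{60}b_3' + \tfrac{1}{2520}b_5'\right) \quad\ldots\text{(xxii)}.$$

$$\mu_2' = \nu_2' - \tfrac{1}{12} + \Big\{-\tfrac{1}{120}\left(a_2' - \tfrac{5}{126}a_4'\right) + \tfrac{1}{120}\left(b_2' - \tfrac{5}{126}b_4'\right) + \tfrac16p\left(b_1' - \tfrac{1}{60}b_3' + \tfrac{1}{2520}b_5'\right)\Big\} \quad\ldots\text{(xxiii)}.$$

$$\mu_3' = \nu_3' - \tfrac14\nu_1' + \Big\{-\tfrac{1}{40}\left(a_1' - \tfrac{5}{63}a_3' + \tfrac{1}{240}a_5'\right) - \tfrac{1}{40}\left(b_1' - \tfrac{5}{63}b_3' + \tfrac{1}{240}b_5'\right) + \tfrac{1}{40}p\left(b_2' - \tfrac{5}{126}b_4'\right) + \tfrac14p^2\left(b_1' - \tfrac{1}{60}b_3' + \tfrac{1}{2520}b_5'\right)\Big\} \quad\ldots\text{(xxiv)}.$$

$$\mu_4' = \nu_4' - \tfrac12\nu_2' + \tfrac{7}{240} + \Big\{\tfrac{1}{126}\left(a_2' - \tfrac{7}{80}a_4'\right) - \tfrac{1}{126}\left(b_2' - \tfrac{7}{80}b_4'\right) - \tfrac{1}{10}p\left(b_1' - \tfrac{5}{63}b_3' + \tfrac{1}{240}b_5'\right) + \tfrac{1}{20}p^2\left(b_2' - \tfrac{5}{126}b_4'\right) + \tfrac13p^3\left(b_1' - \tfrac{1}{60}b_3' + \tfrac{1}{2520}b_5'\right)\Big\} \quad\ldots\text{(xxv)}.$$

$$\mu_5' = \nu_5' - \tfrac56\nu_3' + \tfrac{7}{48}\nu_1' + \Big\{\tfrac{5}{252}\left(a_1' - \tfrac{7}{40}a_3' + \tfrac{7}{440}a_5'\right) + \tfrac{5}{252}\left(b_1' - \tfrac{7}{40}b_3' + \tfrac{7}{440}b_5'\right) - \tfrac{5}{126}p\left(b_2' - \tfrac{7}{80}b_4'\right) - \tfrac14p^2\left(b_1' - \tfrac{5}{63}b_3' + \tfrac{1}{240}b_5'\right) + \tfrac{1}{12}p^3\left(b_2' - \tfrac{5}{126}b_4'\right) + \tfrac{5}{12}p^4\left(b_1' - \tfrac{1}{60}b_3' + \tfrac{1}{2520}b_5'\right)\Big\} \quad\ldots\text{(xxvi)}.$$

$$\mu_6' = \nu_6' - \tfrac54\nu_4' + \tfrac{7}{16}\nu_2' - \tfrac{31}{1344} + \Big\{-\tfrac{1}{80}\left(a_2' - \tfrac{5}{33}a_4'\right) + \tfrac{1}{80}\left(b_2' - \tfrac{5}{33}b_4'\right) + \tfrac{5}{42}p\left(b_1' - \tfrac{7}{40}b_3' + \tfrac{7}{440}b_5'\right) - \tfrac{5}{42}p^2\left(b_2' - \tfrac{7}{80}b_4'\right) - \tfrac12p^3\left(b_1' - \tfrac{5}{63}b_3' + \tfrac{1}{240}b_5'\right) + \tfrac18p^4\left(b_2' - \tfrac{5}{126}b_4'\right) + \tfrac12p^5\left(b_1' - \tfrac{1}{60}b_3' + \tfrac{1}{2520}b_5'\right)\Big\} \quad\ldots\text{(xxvii)}.$$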


(6) We now propose to illustrate the degree of exactness with which it is possible to obtain the moment-coefficients of curves with marked degrees of abruptness, and further to investigate in practice the extent to which small terminal range elements may be of advantage. We will commence with some mathematical frequency distributions for which it is possible to calculate the exact values of the moment-coefficients.

Illustration I. Moment-coefficients of the common parabola $y = \sqrt{x} \times 100{,}000$ from x = 0 to 10. This is a good case for a test, for the curve rises vertically at $x_0 = 0$, and therefore, theoretically, our equations fail. At $x_p = 10$, we have a finite ordinate and finite abruptness coefficients. We are hardly likely to get a case wherein the abruptness causes greater changes in the grouped frequency moments, or to which it is less possible a priori to apply merely Sheppard's corrections.

* The distances of the successive concentrated groups are $\tfrac12h, \tfrac32h, \tfrac52h, \ldots$. In taking moments it is convenient to use 1, 3, 5, etc. and then before substitution multiply $\nu_s'$ by $(.5)^s$.
We divide the space from x = 0 to x = 10 into ten subranges giving the following system of "frequencies":

    Absolute frequencies          Proportional frequencies
    n_1      =    66,667          n'_1     = .031,623
    n_2      =   121,895          n'_2     = .057,820
    n_3      =   157,848          n'_3     = .074,874
    n_4      =   186,923          n'_4     = .088,665
    n_5      =   212,023          n'_5     = .100,571
    n_{p-4}  =   234,440          n'_{p-4} = .111,205
    n_{p-3}  =   254,888          n'_{p-3} = .120,904
    n_{p-2}  =   273,811          n'_{p-2} = .129,880
    n_{p-1}  =   291,505          n'_{p-1} = .138,273
    n_p      =   308,185          n'_p     = .146,185
    Total frequency = 2,108,185              1.000,000

These lead to

    Abruptness Differentials and Ordinate

                            Calculated               Actual
    a_1 = −.0131,0643       y_0      =  27630.78       0
    a_2 = −.0444,8167       y_0'     = +93775.59       ∞
    a_3 =  .0258,4150       y_0''    = −54478.66       ∞
    a_4 = −.0148,8400       y_0'''   = +31378.23       ∞
    a_5 =  .0045,0200       y_0^iv   =  −9491.05       ∞

Now it is clear that the ordinate of our auxiliary curve is not zero, but it looks larger than it really is relative to the ordinate at the other terminal which is 316,227.77 so that the ratio is only .087, or if the curve be actually drawn to any reasonable scale, the ordinate of the auxiliary curve at the vertex which is less than one-tenth of that at the other terminal, looks relatively small. We may also compare it with the ordinate of the actual curve at x = 1 which is 100,000, or between three and four times as great. Similarly the abruptness differentials are not infinite, but their values in the actual curve are very considerable at x = 1 and are then: $y_1' = 50{,}000$, $y_1'' = -25{,}000$, $y_1''' = 37{,}500$ and $y_1^{iv} = -93{,}750$. Thus the first two for the auxiliary curve are about double, the third of the same order, but the fourth is much less. Clearly all this is a result of the fact that we cannot expand $\sqrt{x}$ in a series of integer powers of x, and this is one of the reasons why we selected it. We want to determine whether the formulae give a very bad result for the moments even in the case of extreme abruptness. Accordingly our real test lies in the values of the deduced moment-coefficients and not in those of the abruptness differentials.


We pass now to the b's and find:

                            Calculated                  Actual
    b_1 =  .1499,9857       y_p      =  316,224.74      316,227.77
    b_2 = −.0074,9283       y_p'     =  −15,796.27      −15,811.39
    b_3 = −.0003,9450       y_p''    =     −831.68         −790.57
    b_4 = −.0000,2600       y_p'''   =      −54.81         −118.59
    b_5 = −.0000,3800       y_p^iv   =      −80.11          −29.65

Here again the values of the terminal ordinate and the abruptness coefficients, although good in the former case, are only approximations and the real test must depend on the moment-coefficients. If we go as far as $\mu_4'$ we have to calculate the following eight functions:

    $a_1' - \tfrac{1}{60}a_3' + \tfrac{1}{2520}a_5'$ = −.0131,0643 − .0004,3069 + .0000,0179 = −.0135,3533,
    $a_2' - \tfrac{5}{126}a_4'$ = −.0444,8167 + .0005,9063 = −.0438,9104,
    $a_1' - \tfrac{5}{63}a_3' + \tfrac{1}{240}a_5'$ = −.0131,0643 − .0020,5092 + .0000,1876 = −.0151,3859,
    $a_2' - \tfrac{7}{80}a_4'$ = −.0444,8167 + .0013,0235 = −.0431,7932.

The values we require are

    $\tfrac{1}{12}(a_1' - \tfrac{1}{60}a_3' + \tfrac{1}{2520}a_5')$ = −.0011,2794,     $\tfrac{1}{120}(a_2' - \tfrac{5}{126}a_4')$ = −.0003,6576,
    $\tfrac{1}{40}(a_1' - \tfrac{5}{63}a_3' + \tfrac{1}{240}a_5')$ = −.0003,7846,      $\tfrac{1}{126}(a_2' - \tfrac{7}{80}a_4')$ = −.0003,4269.

It will be seen from these results that $a_3'$ does not contribute very much and
$a_5'$ still less to the final corrections. We now take the b's and find
$$\begin{aligned}
b_1' - \tfrac{1}{60}b_3' + \tfrac{1}{2520}b_5' &= .1500,0513, & b_2' - \tfrac{5}{126}b_4' &= -.0074,9273,\\
b_1' - \tfrac{5}{63}b_3' + \tfrac{1}{240}b_5' &= .1500,2972, & b_2' - \tfrac{7}{80}b_4' &= -.0074,9055.
\end{aligned}$$




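Whence we deduce for the abruptness functions, since $p = 10$:
$$\begin{aligned}
\tfrac{1}{12}\left(b_1' - \tfrac{1}{60}b_3' + \tfrac{1}{2520}b_5'\right) &= .0125,0043, & \tfrac{1}{6}\,p\left(b_1' - \tfrac{1}{60}b_3' + \tfrac{1}{2520}b_5'\right) &= .2500,0860,\\
\tfrac{1}{4}\,p^2\left(b_1' - \tfrac{1}{60}b_3' + \tfrac{1}{2520}b_5'\right) &= 3.7501,2825, & \tfrac{1}{3}\,p^3\left(b_1' - \tfrac{1}{60}b_3' + \tfrac{1}{2520}b_5'\right) &= 50.0017,1000,\\
\tfrac{1}{120}\left(b_2' - \tfrac{5}{126}b_4'\right) &= -.0000,6244, & \tfrac{1}{40}\,p\left(b_2' - \tfrac{5}{126}b_4'\right) &= -.0018,7318,\\
\tfrac{1}{20}\,p^2\left(b_2' - \tfrac{5}{126}b_4'\right) &= -.0374,6365, & \tfrac{1}{40}\left(b_1' - \tfrac{5}{63}b_3' + \tfrac{1}{240}b_5'\right) &= .0037,5074,\\
\tfrac{1}{10}\,p\left(b_1' - \tfrac{5}{63}b_3' + \tfrac{1}{240}b_5'\right) &= .1500,2972, & \tfrac{1}{126}\left(b_2' - \tfrac{7}{80}b_4'\right) &= -.0000,5945.
\end{aligned}$$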


We now give the values of the grouped moment-coefficients about the origin.
Alongside them we place their values as corrected by Sheppard's terms. We then
give the values as found by full correction formulae and lastly the actual values as
deduced by integrating the parabola.

                            Values with Sheppard's                Values with full
   Raw moments                   corrections                        corrections        Actual values
  $\nu_1'$     5.9880   $\nu_1'$                                5.9880   $\mu_1'$     5.9994         6.0000
  $\nu_2'$    42.6900   $\nu_2' - \tfrac{1}{12}$                        42.6067   $\mu_2'$    42.8570        42.8571
  $\nu_3'$   331.0854   $\nu_3' - \tfrac{1}{4}\nu_1'$                   329.5884   $\mu_3'$   333.3349       333.3333
  $\nu_4'$  2698.7735   $\nu_4' - \tfrac{1}{2}\nu_2' + \tfrac{7}{240}$      2677.4576   $\mu_4'$  2727.2797      2727.2729


It will be seen that the fully corrected results are in most excellent accord with 
the actual values. Sheppard’s corrections, although component parts of the general 
corrections, move if taken alone in the wrong direction, i.e. they lower moments, all 
of which need to be raised. Thus while Sheppard’s correction lowers the fourth 
moment by about 21, our new corrections raise it by about 50, the result being the 
requisite raising by 29. 
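
The raw, Sheppard-corrected and actual columns of the table above can be checked with a short sketch. It assumes, as the quoted ordinates and differentials indicate, that the test curve is proportional to $\sqrt{x}$ on $0 \le x \le 10$ with ten unit subranges; the function names are illustrative and the full abruptness corrections are not reproduced here.

```python
def grouped_moments_sqrt_curve(upper=10, groups=10):
    """Raw grouped moments about the origin for a curve proportional to sqrt(x)."""
    h = upper / groups
    total = upper ** 1.5                          # proportional to the whole area
    raw = [0.0] * 5
    for s in range(groups):
        lo, hi = s * h, (s + 1) * h
        freq = (hi ** 1.5 - lo ** 1.5) / total    # relative frequency of the subrange
        mid = (lo + hi) / 2                       # frequency concentrated at the midpoint
        for k in range(5):
            raw[k] += freq * mid ** k
    return raw


def sheppard(raw, h=1.0):
    """Sheppard's corrections for moments taken about a fixed origin."""
    _, n1, n2, n3, n4 = raw
    return [n1,
            n2 - h ** 2 / 12,
            n3 - h ** 2 * n1 / 4,
            n4 - h ** 2 * n2 / 2 + 7 * h ** 4 / 240]


def true_moments(upper=10):
    """Exact moments about the origin: 3 * upper**k / (2k + 3)."""
    return [3 * upper ** k / (2 * k + 3) for k in range(1, 5)]


raw = grouped_moments_sqrt_curve()
print([round(v, 4) for v in raw[1:]])        # raw moments
print([round(v, 4) for v in sheppard(raw)])  # lowered further by Sheppard's terms
print([round(v, 4) for v in true_moments()]) # values found by integration
```

The printout makes the point of the paragraph visible: each Sheppard value lies below the raw value, which is itself below the integrated value.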

It seems to us unlikely that a more unfavourable case for our abruptness
coefficients could be found. It certainly emphasises the point that to obtain very
good results it is quite unnecessary for the terminal ordinates and differential
coefficients of the actual curve of frequency and the auxiliary terminal curve to be
closely identical.

* The $a'$'s and the $b'$'s will in this case be equal to the $a$'s and $b$'s.


(7) Illustration II. Now let us take a normal curve containing 1,000,000
individuals with a standard deviation of unity, and let us suppose the frequency
grouped on 0.5 x standard deviation subranges, the first such subrange being central.
Then, adjusting to units, we have the following system:

       Subrange              Frequency
    - .25 to + .25            197,414
    ± .25 to ± .75            174,666
    ± .75 to ± 1.25           120,977
    ± 1.25 to ± 1.75           65,591
    ± 1.75 to ± 2.25           27,834
    ± 2.25 to ± 2.75            9,245
    ± 2.75 to ± 3.25            2,402
    ± 3.25 to ± 3.75              489
    ± 3.75 to ± 4.25               78
    ± 4.25 to ± 4.75               10
    ± 4.75 to ± 5.25                1

To test the error introduced by our adjustments, take second moments for the
complete curve about the centre of the group from -.25 to +.25. We have
$$\nu_1 = 0, \qquad \nu_2 = 4.083,394.$$
Using Sheppard's correction, as the abruptness coefficients are zero, we have
$$\mu_2 = 4.000,061 \text{ in working units} = 1.000,0152 \text{ in actual units}.$$
Accordingly $\sigma = 1.000,008$, which is a quite good approximation to unity. The
error introduced by our adjustments for omitted decimals is therefore not great.
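
As a rough check on this system, the sketch below regenerates the grouped frequencies from the normal integral and applies Sheppard's correction to the second moment. It works with unrounded frequencies, so it will not reproduce the small adjustments "to units" made in the text; the helper names are illustrative.

```python
import math


def normal_cdf(z):
    """Area of the unit normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def grouped_normal(total=1_000_000, h=0.5, half_groups=10):
    """Frequencies on subranges of width h, the first subrange centred at zero."""
    rows = []
    for j in range(-half_groups, half_groups + 1):
        lo, hi = (j - 0.5) * h, (j + 0.5) * h
        rows.append((j, total * (normal_cdf(hi) - normal_cdf(lo))))
    return rows


rows = grouped_normal()
n = sum(f for _, f in rows)
nu2_working = sum(f * j * j for j, f in rows) / n   # working unit = one subrange
mu2_working = nu2_working - 1.0 / 12.0              # Sheppard's correction
mu2_actual = mu2_working * 0.5 ** 2                 # back to units of sigma
print(round(rows[10][1]), round(rows[11][1]))       # central group and its neighbour
print(round(nu2_working, 6), round(mu2_actual, 6))
```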
(a) We will start first with the singly truncated normal curve given below,
with $h_1 = h$, i.e. the area from $x = 1.25$ onwards,

   Frequencies          Moments about stump
      65,591            $\nu_1'$ =  1.029,513
      27,834            $\nu_2'$ =  1.693,994
       9,245            $\nu_3'$ =  3.883,416
       2,402            $\nu_4'$ = 10.974,937
         489               (in working units)
          78
          10
           1
   Total 105,650

and determine its moment-coefficients, as a frequency curve having high contact at
one end and marked abruptness at the other. In this case all the b's are zero and
we only need to find the a's. Clearly $a' = a$.
Using (ii) we determine from

   $n_1'$ = .6208,3294   the values:   $a_1'$ = -.878,708,
   $n_2'$ = .2634,5480                 $a_2'$ =  .607,984,
   $n_3'$ = .0875,0592                 $a_3'$ = -.296,843,
   $n_4'$ = .0227,3545                 $a_4'$ =  .081,723,
   $n_5'$ = .0046,2849                 $a_5'$ =  .005,730.

Whence
$$\begin{aligned}
a_1' - \tfrac{1}{60}a_3' + \tfrac{1}{2520}a_5' &= -.873,758, & a_2' - \tfrac{5}{126}a_4' &= +.604,741,\\
a_1' - \tfrac{5}{63}a_3' + \tfrac{1}{240}a_5' &= -.855,125, & a_2' - \tfrac{7}{80}a_4' &= +.600,833.
\end{aligned}$$

The corrections due to these abruptness coefficients by (xxii)-(xxv) are for $\nu_1'$, $\nu_2'$,
$\nu_3'$ and $\nu_4'$ respectively
   -.072,813, -.005,040, +.021,378 and +.004,769,
while the corresponding Sheppard's corrections are
   0, -.083,333, -.257,378 and -.817,830.


Thus we deduce for moment-coefficients about stump:

                                  With Sheppard's
               Raw moment           corrections        Full corrections
  $\nu_1'$      1.029,513            1.029,513              .956,700
  $\nu_2'$      1.693,994            1.610,661             1.605,621
  $\nu_3'$      3.883,416            3.626,038             3.647,416
  $\nu_4'$     10.974,937           10.157,107            10.161,876

These values are in working units = .5 actual units. Hence in actual units
we have

                        With Sheppard's                               With full
   Raw moments            corrections                                corrections         True values
  $\nu_1'$  .514,756   $\nu_1'$                             .514,756   $\mu_1'$  .478,350      .478,8131
  $\nu_2'$  .423,498   $\nu_2' - \tfrac{1}{12}$                     .402,665   $\mu_2'$  .401,405      .401,4837
  $\nu_3'$  .485,427   $\nu_3' - \tfrac{1}{4}\nu_1'$                 .453,255   $\mu_3'$  .455,927      .455,7714
  $\nu_4'$  .685,934   $\nu_4' - \tfrac{1}{2}\nu_2' + \tfrac{7}{240}$    .634,819   $\mu_4'$  .635,117      .634,7360


It will be seen from these results that our full correction values for the moments
about the stump are in every case accurate to 1 in the 1000, while, if Sheppard's
corrections only are made, we may be out nearly 1 in the 100. The change in the
mean, second and third moment-coefficients is very noteworthy. In the case of the
fourth moment we are out .0004 in .6347, while the Sheppard's correction alone is
out only .0001 in .6347. The cause of this irregularity we have not been able to
detect, although we have examined carefully the whole of our arithmetic. It
seemed accordingly worth while inquiring what differences would occur when the
moment-coefficients were taken about the mean and not about the stump.


   Raw moments            With Sheppard's                                            With full
   about mean               corrections                                             corrections        True values
  $\nu_2$  .158,524   $\nu_2 - \tfrac{1}{12}(.5)^2$                               .137,691   $\mu_2$  .172,587     .172,222
  $\nu_3$  .104,226   $\nu_3$                                            .104,226   $\mu_3$  .098,801     .098,612
  $\nu_4$  .149,090   $\nu_4 - \tfrac{1}{2}\nu_2(.5)^2 + \tfrac{7}{240}(.5)^4$         .131,097   $\mu_4$  .156,767     .156,405
It will be seen that now Sheppard's corrections are wholly inadequate and our
corrections are essential, even in the case of the fourth moment-coefficient. This 
confirms the view of Sheppard himself, who insisted on the importance of high 
contact at the terminals, if they are to be used alone. It is a convincing illustration 
of the fallacy of those “ proofs” of Sheppard’s corrections which do not appeal to 
the principle of high terminal contact. 
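
The "true values" quoted in the two tables above can be checked from the normal integral itself. The sketch below is one way of doing so, not necessarily the method used for the tables: it uses the recursion $I_k = (k-1)I_{k-2} - aI_{k-1}$ for moments about the stump $x = a$ of a unit normal curve truncated to $x > a$, which follows from integrating by parts. Function names are illustrative.

```python
import math


def stump_moments(a, order=4):
    """Moments about the stump x = a of a unit normal truncated to x > a.

    I_0 = 1, I_1 = phi(a)/Q(a) - a, I_k = (k - 1) I_{k-2} - a I_{k-1}.
    """
    phi = math.exp(-0.5 * a * a) / math.sqrt(2.0 * math.pi)
    tail = 0.5 * math.erfc(a / math.sqrt(2.0))
    m = [1.0, phi / tail - a]
    for k in range(2, order + 1):
        m.append((k - 1) * m[k - 2] - a * m[k - 1])
    return m


def central_from_raw(m):
    """Second to fourth moments about the mean from moments about a fixed point."""
    m1, m2, m3, m4 = m[1:5]
    return (m2 - m1 ** 2,
            m3 - 3 * m1 * m2 + 2 * m1 ** 3,
            m4 - 4 * m1 * m3 + 6 * m1 ** 2 * m2 - 3 * m1 ** 4)


about_stump = stump_moments(1.25)
about_mean = central_from_raw(about_stump)
print([round(v, 6) for v in about_stump[1:]])
print([round(v, 6) for v in about_mean])
```

The results should agree with the printed true values to within a few units in the fifth decimal place, the residue presumably reflecting the tables of the normal integral available at the time.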

We now propose to illustrate the degree of improvement in the exactness
obtained, if we calculate the abruptness coefficients on smaller subranges. Accordingly
we break up the terminal group 65,591 on the $.5\sigma$ base into five groups each on
a $.1\sigma$ base. These are

   17,142   leading to   $n_1'$ = .1622,5272   and   $a_1$ = -.1728,9281,
   14,979                $n_2'$ = .1417,7946         $a_2$ =  .0216,4780,
   12,959                $n_3'$ = .1226,5973         $a_3$ = -.0010,4603,
   11,099                $n_4'$ = .1050,5442         $a_4$ = -.0002,3651,
    9,412                $n_5'$ = .0890,8661         $a_5$ =  .0000,3781,

whence, remembering $a_s' = (h/h_1)^s a_s = 5^s a_s$, we find

   $a_1'$ = -.864,464,   and   $\tfrac{1}{12}\left(a_1' - \tfrac{1}{60}a_3' + \tfrac{1}{2520}a_5'\right)$ = -.071,853,
   $a_2'$ =  .541,195,         $-\tfrac{1}{120}\left(a_2' - \tfrac{5}{126}a_4'\right)$ = -.004,559,
   $a_3'$ = -.130,754,         $-\tfrac{1}{40}\left(a_1' - \tfrac{5}{63}a_3' + \tfrac{1}{240}a_5'\right)$ = .021,351,
   $a_4'$ = -.147,818,         $\tfrac{1}{126}\left(a_2' - \tfrac{7}{80}a_4'\right)$ = .004,398,
   $a_5'$ =  .118,156.


Thus the moment-coefficients become

   $\mu_1'$ =   .957,660   or in actual units   .478,830,
   $\mu_2'$ =  1.606,102                        .401,526,
   $\mu_3'$ =  3.647,389                        .455,924,
   $\mu_4'$ = 10.161,505                        .635,094.

Transferring to the mean we have

              On .5 subranges    On .1 subranges    Actual values
   $\mu_1'$      .478,350           .478,830           .478,813
   $\mu_2$       .172,587           .172,248           .172,222
   $\mu_3$       .098,801           .098,705           .098,612
   $\mu_4$       .156,767           .156,516           .156,405

While the first column of values would be amply adequate for most statistical
purposes, the second makes a still closer approximation to the actual values, the
differences being only

                +.000,017,  +.000,026,  +.000,093,  +.000,111
   as against   -.000,463,  +.000,365,  +.000,189,  +.000,362

respectively. The greatest improvements are in the mean and standard deviation.
Accordingly it is well worth using smaller terminal subranges, if they are available
as in the cases of cricket scores, wages, house values, infant mortality and other
frequency material.
(8) Illustration II. (b) We now propose to consider the moment-coefficients
of a doubly truncated normal curve. We will take the portion of the above 1,000,000
distribution with unit standard deviation from variate value 1.25 to variate value
3.75 and divide it into five groups, i.e.

              Absolute         Relative
             frequencies      frequencies
               65,591         .6213,5637
               27,834         .2636,7693
                9,245         .0875,7969
                2,402         .0227,5462
                  489         .0046,3239
   Total      105,561        1.0000,0000


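Using (iii) and (iv), which now involve all five groups, we find

   $a_1$ = -.8794,4917,     $b_1$ = -.0088,5527,
   $a_2$ =  .6084,9651,     $b_2$ =  .0258,2393,
   $a_3$ = -.2970,9362,     $b_3$ = -.0401,0477,
   $a_4$ =  .0817,9157,     $b_4$ =  .0530,8779,
   $a_5$ = -.0057,4076,     $b_5$ =  .0057,4076.

From these results, since the $a$'s = $a'$'s and the $b$'s = $b'$'s, we have for the abruptness functions:
$$\begin{aligned}
a_1' - \tfrac{1}{60}a_3' + \tfrac{1}{2520}a_5' &= -.8744,9989, & b_1' - \tfrac{1}{60}b_3' + \tfrac{1}{2520}b_5' &= -.0081,8458,\\
a_2' - \tfrac{5}{126}a_4' &= .6052,5081, & b_2' - \tfrac{5}{126}b_4' &= .0237,1727,\\
a_1' - \tfrac{5}{63}a_3' + \tfrac{1}{240}a_5' &= -.8558,9423, & b_1' - \tfrac{5}{63}b_3' + \tfrac{1}{240}b_5' &= -.0056,4843,\\
a_2' - \tfrac{7}{80}a_4' &= .6013,3975, & b_2' - \tfrac{7}{80}b_4' &= .0211,7875.
\end{aligned}$$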
About the first terminal we have for the raw moment-coefficients
$$\nu_1' = 1.025,630, \quad \nu_2' = 1.668,535, \quad \nu_3' = 3.733,743, \quad \nu_4' = 10.108,966,$$
and by (xxii)-(xxv) the corresponding corrective terms are
   -.0731,4037,   -.0074,9994,   +.0044,7459,   -.0981,1557,
leading to the Sheppard's correction moment-coefficients in actual units:
$$\mu_1' = .512,815, \quad \mu_2' = .396,300, \quad \mu_3' = .434,667, \quad \mu_4' = .581,492;$$
and the full correction moment-coefficients:
$$\mu_1' = .476,245, \quad \mu_2' = .394,425, \quad \mu_3' = .435,226, \quad \mu_4' = .575,360.$$

We now transfer to the mean of the block and find 

fy = "AT0,245; a = 167,616, = 087, 180d, Sie — 123/69 
while the values for the Sheppard’s corrections only would be 

jy = DL2 815, o fis—"133.320 oe , — 104,701, ji — 100, a 
The theoretical values for the normal curve block are 

fy, = 476930, 9 eg = 168,025, Se pg 05957 30) fy = 133,748. 
It will be seen that the Sheppard's corrections alone give very unsatisfactory
results, and that while the full corrections for the first three moments are statistically 
satisfactory, the approximation of the fourth moment-coefficient would not for 
certain investigations be adequate. We are in fact using only five groups and 
trusting to these for the accuracy of our abruptness coefficients. We will accordingly 
now test what improvement arises when we divide our terminal groups into five sub- 
groups and calculate the abruptness coefficients on these smaller subranges. Thus 
we have $h = .5$, $h_1 = h_p = .1$, and therefore $h/h_1 = h/h_p = 5$. Our subgroups are:

   $n_1$ = 17,142   therefore   $n_1'$ = .1623,8952,      $n_{p-4}$ = 173   therefore   $n_{p-4}'$ = .0016,3886,
   $n_2$ = 14,979               $n_2'$ = .1418,9900,      $n_{p-3}$ = 124               $n_{p-3}'$ = .0011,7468,
   $n_3$ = 12,959               $n_3'$ = .1227,6314,      $n_{p-2}$ =  88               $n_{p-2}'$ = .0008,3364,
   $n_4$ = 11,099               $n_4'$ = .1051,4300,      $n_{p-1}$ =  61               $n_{p-1}'$ = .0005,7786,
   $n_5$ =  9,412               $n_5'$ = .0891,6172,      $n_p$     =  43               $n_p'$     = .0004,0735.
           65,591                                                489

Whence

   $a_1$ = -.1730,3848   and   $a_1'$ = -.8651,9242,     $b_1$ =  .0003,5810   and   $b_1'$ =  .0017,9049,
   $a_2$ = +.0216,6599         $a_2'$ = +.5416,4969,     $b_2$ =  .0000,5368         $b_2'$ =  .0013,4204,
   $a_3$ = -.0010,4679         $a_3'$ = -.1308,4851,     $b_3$ =  .0001,5157         $b_3'$ =  .0189,4639,
   $a_4$ = -.0002,3683         $a_4'$ = -.1480,1868,     $b_4$ = -.0000,7579         $b_4'$ = -.0473,6598,
   $a_5$ = +.0000,3789         $a_5'$ = +.1184,1494,     $b_5$ =  .0000,3789         $b_5'$ =  .1184,1494.

Determining the abruptness functions from these values, we have
$$\begin{aligned}
a_1' - \tfrac{1}{60}a_3' + \tfrac{1}{2520}a_5' &= -.8629,6462, & b_1' - \tfrac{1}{60}b_3' + \tfrac{1}{2520}b_5' &= +.0015,2171,\\
a_2' - \tfrac{5}{126}a_4' &= .5475,2345, & b_2' - \tfrac{5}{126}b_4' &= +.0032,2164,\\
a_1' - \tfrac{5}{63}a_3' + \tfrac{1}{240}a_5' &= -.8543,1522, & b_1' - \tfrac{5}{63}b_3' + \tfrac{1}{240}b_5' &= +.0007,8021,\\
a_2' - \tfrac{7}{80}a_4' &= .5604,7508, & b_2' - \tfrac{7}{80}b_4' &= +.0054,8656.
\end{aligned}$$


Working out the corrective terms for abruptness we find them
   -.071,787,   -.003,368,   +.031,252,   +.071,446
in working units, leading to
$$\mu_1' = .476,922, \quad \mu_2' = .395,458, \quad \mu_3' = .438,573, \quad \mu_4' = .585,957,$$
or transferring to the mean we have
$$\mu_1' = .476,922, \quad \mu_2 = .168,008, \quad \mu_3 = .089,722, \quad \mu_4 = .133,788,$$
as against the actual values 

fe AN OO30NN Wis LOS 025505 4 — 089,100, 9 4, = "133,143, 
an eminently satisfactory agreement. It is thus clear that when possible it is
desirable to obtain the abruptness corrections by small subranges, in this case 1/10 of
the standard deviation. Hence any terminal small range groupings such as are
frequently provided in statistical data are useful from this standpoint. In fact if
the abruptness coefficients are found from such small groupings the remaining subranges
can safely be made fairly coarse, as in the above examples, where five
divisions of the total range are clearly adequate.
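
The theoretical values for the normal curve block from 1.25 to 3.75 can be approximated by direct quadrature. The sketch below uses a composite Simpson rule, which is a choice made here for simplicity and not a method taken from the text; the names are illustrative.

```python
import math


def phi(x):
    """Ordinate of the unit normal curve."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule with an even number n of strips."""
    h = (hi - lo) / n
    s = f(lo) + f(hi)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(lo + i * h)
    return s * h / 3.0


def block_moments(lo=1.25, hi=3.75):
    """Area, mean about the stump, and 2nd to 4th moments about the mean."""
    area = simpson(phi, lo, hi)
    mean = simpson(lambda x: (x - lo) * phi(x), lo, hi) / area
    centre = lo + mean
    central = [simpson(lambda x, k=k: (x - centre) ** k * phi(x), lo, hi) / area
               for k in (2, 3, 4)]
    return area, mean, central


area, mean, central = block_moments()
print(round(area * 1_000_000), round(mean, 6), [round(c, 6) for c in central])
```

The area reproduces the block total of 105,561 per million, and the mean and second moment agree with the quoted theoretical figures to about one unit in the fifth decimal place.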


(9) Illustration III. Mean Age and Variability of Infants at Death. It is 
very important in practical statistics to obtain the mean and standard deviation of 
J-shaped curves. A good illustration of such curves may be found in infantile 
mortality statistics. These have the advantage that in the early part of the year 
of infancy the frequencies are in certain cases given by much smaller intervals. 
Thus in the Prussian official statistics they are given for the first fortnight by days. 
Professor Raymond Pearl in a paper of 1906 (Biometrika, Vol. IV, p. 510) has
endeavoured to ascertain the mean age at death of infants in the first year of life 
from the Prussian data. It will be of interest to determine what changes are likely 
to be made in his results by the use of our present abruptness corrections. He 
writes (p. 512): 

It is evident that the grouping here [i.e. in the Prussian data] is sufficiently fine to make 
possible a very accurate determination of the mean age of death.... A standard month of 
30 days was assumed: then with a unit of 30 days the first and second moment-coefficients about 
an arbitrary axis were determined. From these the position of the mean and the value of the 
second moment about it were easily found. Only the “rough” second moment was calculated, as
it was deemed sufficiently accurate for present purposes, and furthermore it was difficult to deter- 
mine the proper corrective terms to apply in this case. In the calculations each frequency 
element was for practical convenience centred at the midpoint of its range. The error made by 
so doing is negligible. 

With our present corrections we can test how far the errors made by concentration
at the midpoints of the subranges are really negligible. It is certainly right
to concentrate at those points provided we allow for terminal abruptness, which is
very marked in this case. If we make the proper terminal corrections theory shows
that quite considerable subranges, say in this case one month, may be used to
determine the raw moments. It will be sufficient to illustrate the method on the
Prussian male infant deaths.

We have deaths per 1000 infants born:         For the birth terminal we have*:

   Months      Deaths                            Days       Deaths
    0-1        63.99                             0-3        18.25
    1-2        22.59                             3-6         6.58
    2-3        18.58                             6-9         7.89
    3-4        15.96                             9-12        5.65
    4-5        13.30                            12-15        5.82
    5-6        11.51
    6-7        10.61
    7-8         9.30
    8-9         8.74
    9-10        8.29
   10-11        7.51
   11-12        6.94

   Total      197.32

$\nu_1' = 3.759,224$ and $\nu_2' = 25.809,801$ in months.

* Three day intervals taken with a view to smoothing anomalous values.
The month subranges will be quite adequate at the childhood terminal.

As the results are based on 1877-1881 averages, we shall suppose the month
to be 30.4375 days. Thus $h/h_1 = 10.145,833$, $h/h_p = 1$. We find

   $n_1'$ = .0924,8936,    $a_1$ = -.1877,2687,    $a_1'$ =      -1.904,640,
   $n_2'$ = .0333,4685,    $a_2$ =  .2966,9657,    $a_2'$ =      30.541,331,
   $n_3'$ = .0399,8581,    $a_3$ = -.3909,0057,    $a_3'$ =    -408.253,057,
   $n_4'$ = .0286,3369,    $a_4$ =  .3117,2715,    $a_4'$ =    3303.128,589,
   $n_5'$ = .0294,9524,    $a_5$ = -.1139,7730,    $a_5'$ =  -12253.408,881;

N yp—4 = 0471,3156, pe 0, — 0309) 

n'y = 0442,9353, b,! = bs = — 004,823, 

n'y = 0420,1297, ees eSAb 

yp -1 = °0380,6000, b, = bs = — 012,670, 

W, ='0351,7130, b, =b;= 004,967. 


From these we deduce
$$\begin{aligned}
\tfrac{1}{12}\left(a_1' - \tfrac{1}{60}a_3' + \tfrac{1}{2520}a_5'\right) &= .003,093, & \tfrac{1}{12}\left(b_1' - \tfrac{1}{60}b_3' + \tfrac{1}{2520}b_5'\right) &= .002,961,\\
\tfrac{1}{120}\left(a_2' - \tfrac{5}{126}a_4'\right) &= -.837,793, & \tfrac{1}{120}\left(b_2' - \tfrac{5}{126}b_4'\right) &= -.000,036,
\end{aligned}$$
whence   Total abruptness correction on $\nu_1'$ = .006,054,
                                     on $\nu_2'$ = .843,679.
Thus $\mu_1' = 3.765,278$ months, $\mu_2' = 26.570,147$ (months)$^2$,
using of course Sheppard's correction.
Finally we reach
   Mean               = 114.61 days as against* 113.07 days,
   Standard Deviation = 107.15 days as against  105.44 days,
obtained from taking the raw moments of small elements of one day up to the end
of the first fortnight. Thus, if we desire to get a mean within 1.5% of the correct
value, it will be well to adopt abruptness corrections. 
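
The last step of this illustration is simple arithmetic and can be retraced as below. The sketch takes the raw moments and the total abruptness corrections as quoted, adds Sheppard's term of $-\tfrac{1}{12}$ to the second moment, and converts to days with the 30.4375 day month; variable names are illustrative.

```python
import math

DAYS_PER_MONTH = 30.4375              # average month assumed in the text

nu1, nu2 = 3.759224, 25.809801        # raw moments about birth, in months
abrupt1, abrupt2 = 0.006054, 0.843679 # total abruptness corrections quoted above
sheppard2 = -1.0 / 12.0               # Sheppard's term for monthly subranges

mu1 = nu1 + abrupt1                   # corrected first moment
mu2 = nu2 + abrupt2 + sheppard2       # corrected second moment about birth
sd = math.sqrt(mu2 - mu1 ** 2)

mean_days = mu1 * DAYS_PER_MONTH
sd_days = sd * DAYS_PER_MONTH
print(round(mu1, 6), round(mu2, 6))
print(round(mean_days, 2), round(sd_days, 2))
```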

(10) Illustration IV. In view of the fact that in the previous illustration the
infantile death-rate curve has probably an infinite initial ordinate, it seems well to
measure, in a case which can be tested, the degree with which our corrections give
the actual values of the moment-coefficients in such a case.

We choose the curve $y = \tfrac{1}{2}x^{-\frac{1}{2}}$,
and suppose ten subranges going up to the terminal $x = 10$, from $x = 0$.

* Pearl's results modified by taking the average month to be 30.4375, not 30 days.


We have for the frequencies:               Further, for small terminal
                                           subranges, we have:

      x           Frequency                     x           Frequency
    0 to 1       1.000,0000                  0 to .2        .447,2136
    1 to 2        .414,2136                 .2 to .4        .185,2419
    2 to 3        .317,8372                 .4 to .6        .142,1412
    3 to 4        .267,9492                 .6 to .8        .119,8305
    4 to 5        .236,0680                 .8 to 1.0       .105,5728
    5 to 6        .213,4217
    6 to 7        .196,2616
    7 to 8        .182,6758
    8 to 9        .171,5729
    9 to 10       .162,2777

    Total        3.162,2777

It will be sufficient to take the subranges unity at the other terminal, or $h/h_1 = 5$,
$h/h_p = 1$. Thus we have
$$\nu_1' = 3.394,907, \qquad \nu_2' = 20.016,109;$$

   $n_1'$ = .1414,2136,    $a_1$ = -.2332,9561,    $a_1'$ =   -1.166,4781,
   $n_2'$ = .0585,7863,    $a_2$ = +.2583,1707,    $a_2'$ =   +6.457,9265,
   $n_3'$ = .0449,4899,    $a_3$ = -.2657,4026,    $a_3'$ =  -33.217,5325,
   $n_4'$ = .0378,9373,    $a_4$ = +.1798,6052,    $a_4'$ = +112.412,8250,
   $n_5'$ = .0333,8505,    $a_5$ = -.0586,1091,    $a_5'$ = -183.159,0938.

                                                          Actual values
   $n_{p-4}'$ = .0674,8987,    $b_1' = b_1$ = +.050,0105       +.0500,0000,
   $n_{p-3}'$ = .0620,6337,    $b_2' = b_2$ = +.002,4535       +.0025,0000,
   $n_{p-2}'$ = .0577,6716,    $b_3' = b_3$ = +.000,4817       +.0003,7500,
   $n_{p-1}'$ = .0542,5612,    $b_4' = b_4$ = -.000,0497       +.0000,9375,
   $n_p'$     = .0513,1672,    $b_5' = b_5$ = +.000,1316       +.0000,3281.

These values of a's and b's lead to the abruptness functions:
$$\begin{aligned}
\tfrac{1}{12}\left(a_1' - \tfrac{1}{60}a_3' + \tfrac{1}{2520}a_5'\right) &= -.057,1279, & -\tfrac{1}{120}\left(a_2' - \tfrac{5}{126}a_4'\right) &= -.016,6425,\\
\tfrac{1}{12}\left(b_1' - \tfrac{1}{60}b_3' + \tfrac{1}{2520}b_5'\right) &= .004,1669, & \tfrac{1}{120}\left(b_2' - \tfrac{5}{126}b_4'\right) &= .000,0205,
\end{aligned}$$
and
$$\mu_1' = 3.394,9066 - .052,9616 = 3.341,9450,$$
$$\mu_2' = 20.016,1090 - .083,3333 + .066,7155 = 19.999,4912,$$
which gives
$$\mu_2 = 8.830,8953.$$

For comparison we have

                               Using only
                               Sheppard's
              Raw moments      corrections     Full corrections     True values
   $\mu_1'$      3.3949           3.3949             3.3419            3.3333
   $\mu_2'$     20.0161          19.9328            19.9995           20.0000
   $\mu_2$       8.4907           8.4074             8.8309            8.8889
   $\sigma$      2.9139           2.8995             2.9718            2.9814

It will be seen that Sheppard's corrections alone are worse than the raw moment
results. In other words they should certainly not be used alone for J-shaped curves;
it would be better to take the raw moment results without any corrections. On the
other hand the full corrections even in this extreme case, where (a) the Euler-Maclaurin
Theorem fails theoretically, (b) our auxiliary curve is unreasonable, for
$x^{-\frac{1}{2}}$ cannot be expanded at the origin terminal in powers of $x$, are found to give
results within ½% of the true values for both mean and standard deviation.
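
The raw, Sheppard and true columns of this illustration follow directly from the curve $y = \tfrac{1}{2}x^{-\frac{1}{2}}$ and can be regenerated with the sketch below. The full-correction column needs the abruptness formulae of the first part of the paper and is not attempted here; the function name is illustrative.

```python
import math


def illustration_iv(upper=10):
    """Grouped and exact moments for y = (1/2) x**(-1/2) on 0..upper, unit subranges."""
    total = math.sqrt(upper)                       # whole area under the curve
    nu1 = nu2 = 0.0
    for s in range(1, upper + 1):
        freq = (math.sqrt(s) - math.sqrt(s - 1)) / total
        mid = s - 0.5                              # frequency placed at the midpoint
        nu1 += freq * mid
        nu2 += freq * mid ** 2
    raw = (nu1, nu2, nu2 - nu1 ** 2)
    shep = (nu1, nu2 - 1 / 12, nu2 - 1 / 12 - nu1 ** 2)
    exact = (upper / 3, upper ** 2 / 5, upper ** 2 / 5 - (upper / 3) ** 2)
    return raw, shep, exact


raw, shep, exact = illustration_iv()
for name, (m1, m2, var) in (("raw", raw), ("Sheppard", shep), ("true", exact)):
    print(name, round(m1, 4), round(m2, 4), round(var, 4), round(math.sqrt(var), 4))
```

Sheppard's term lowers a second moment that already needs raising, which is why that column moves away from the true standard deviation.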

The variety of illustrations we have taken seems to suggest that for most 
practical statistical problems—even with J- or U-shaped distributions—we shall 
obtain reasonable results from the system developed in the first part of this paper. 
At the same time the method adopted indicates that for the best possible results in 
asymptotic frequency curves it may be needful to use a more suitable auxiliary curve 
for the asymptotic terminal. This leads us directly to the second part of our paper. 


Part II. Cases of Asymptotic Frequency. 

(11) In selecting our auxiliary curve to give the first five frequencies we must
remember that it has (i) to give an infinite ordinate but a finite frequency, (ii) it
must be of such a character that its constants can be readily determined.

If we adopt
$$Z = N\left(1 + x^{q}\left(A + Bx + Cx^2 + Dx^3 + Ex^4\right)\right),$$
where $q$ is chosen less than unity, we have the adequate number of constants and
$y = -dZ/dx$ is infinite when $x = 0$.
If we leave $q$ undetermined, however, we should have six not five constants and
might then omit $E$. But the process of determining $A$, $B$, $C$, $D$ and $q$ would be
very laborious and involve a troublesome series of approximations. We are accordingly
thrown back on the retention of $E$ and an arbitrary choice of $q$. Clearly
to give an infinite ordinate and finite area we may give $q$ any value from slightly
over zero to slightly under unity, and the size of $q$ measures so to speak the intensity
of the asymptoting. This is probably rather an important feature of the frequency
curve, but as we see no way of determining it accurately without very great labour,
we give $q$ its mean value $\tfrac{1}{2}$. Accordingly our problem becomes that of determining
$A$, $B$, $C$, $D$ and $E$ so as to give the first five frequencies or the values of $Nn_1'$, $Nn_2'$,
$Nn_3'$, $Nn_4'$, $Nn_5'$ as before. After a good deal of work they are found to be
$$\begin{aligned}
A &= -1.64964,84755\,n_1' + 3.35035,15245\,n_2' - 3.72071,62874\,n_3' + 2.05278,64045\,n_4' - .44721,35955\,n_5',\\
B &= +.91328,76419\,n_1' - 5.50337,90247\,n_2' + 7.10669,19065\,n_3' - 4.15163,83427\,n_4' + .93169,49906\,n_5',\\
C &= -.31317,72759\,n_1' + 2.64515,60574\,n_2' - 4.30806,06243\,n_3' + 2.76448,01733\,n_4' - .65218,64934\,n_5',\\
D &= +.05299,17797\,n_1' - .53034,15536\,n_2' + 1.00172,31390\,n_3' - .73032,76686\,n_4' + .18633,89981\,n_5',\\
E &= -.00345,36703\,n_1' + .03821,29964\,n_2' - .07963,81338\,n_3' + .06469,94335\,n_4' - .01863,38998\,n_5'.
\end{aligned}\tag{xxviii}$$
The large number of decimals is requisite owing to the high coefficients they have
to be multiplied by in ascertaining the values of the abruptness coefficients.
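
The constants in (xxviii) can be recovered numerically. The sketch below assumes that $Z(x)/N$ is the proportion of the frequency lying beyond $x$, so that with $q = \tfrac{1}{2}$ the five conditions are $x^{1/2}(A + Bx + Cx^2 + Dx^3 + Ex^4) = -(n_1' + \dots + n_x')$ for $x = 1, \dots, 5$. This reading is an inference from the coefficients and not a quotation; the elimination routine and all names are illustrative.

```python
def solve(matrix, rhs):
    """Gauss-Jordan elimination with partial pivoting (pure Python)."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def auxiliary_constants(n_rel, q=0.5):
    """A..E of Z = N(1 + x**q (A + Bx + ... + Ex^4)) from five relative frequencies."""
    matrix = [[float(x) ** (q + j) for j in range(5)] for x in range(1, 6)]
    cumulative, rhs = 0.0, []
    for n in n_rel:
        cumulative += n
        rhs.append(-cumulative)       # Z(x)/N - 1 = -(frequency between 0 and x)
    return solve(matrix, rhs)


# Coefficients of n_1' (and of n_5') in A..E: set that frequency to 1, the rest to 0.
column_n1 = auxiliary_constants([1, 0, 0, 0, 0])
column_n5 = auxiliary_constants([0, 0, 0, 0, 1])
print([round(c, 10) for c in column_n1])
print([round(c, 10) for c in column_n5])
```

Solving once per unit vector gives one column of (xxviii) at a time, which is a convenient way of checking individual printed digits.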
Now our scheme of action is of the following kind: we shall obtain the abruptness
coefficients at $x = 1$, or at the finite ordinate of the first trapezette, for here they
will be finite. We shall then trust to our auxiliary curve to give the moments of
this trapezette about $x = 1$, using the integral
$$n_1\mu_s'' = \int_0^1 (x-1)^s\, y\, dx = -\int_0^1 (x-1)^s\, \frac{dZ}{dx}\, dx.$$
And lastly we shall determine the moments and the corrections of the remainder of
the curve by the process already discussed as if it had to be applied from $x = 1$
onwards*. The moments for the trapezette before $x = 1$, and for the remainder of
the curve, must then be added together to get the total moments and so the moment-coefficients
about $x = 1$. The transference to the centroid then proceeds in the usual
manner.

Moments of first trapezette $n_1$ about non-infinite ordinate:
$$\begin{aligned}
n_1\mu_1'' &= 2N\left(\tfrac{1}{3}A + \tfrac{1}{5}B + \tfrac{1}{7}C + \tfrac{1}{9}D + \tfrac{1}{11}E\right),\\
n_1\mu_2'' &= -8N\left(\tfrac{1}{15}A + \tfrac{1}{35}B + \tfrac{1}{63}C + \tfrac{1}{99}D + \tfrac{1}{143}E\right),\\
n_1\mu_3'' &= 16N\left(\tfrac{1}{35}A + \tfrac{1}{105}B + \tfrac{1}{231}C + \tfrac{1}{429}D + \tfrac{1}{715}E\right),\\
n_1\mu_4'' &= -128N\left(\tfrac{1}{315}A + \tfrac{1}{1155}B + \tfrac{1}{3003}C + \tfrac{1}{6435}D + \tfrac{1}{12155}E\right).
\end{aligned}\tag{xxix}$$

Again remembering that
$$a_s' = \frac{1}{N}\left(\frac{d^s Z}{dx^s}\right)_{x=1},$$
we find
$$\begin{aligned}
a_1' &= \tfrac{1}{2}\left(A + 3B + 5C + 7D + 9E\right),\\
a_2' &= \tfrac{1}{4}\left(-A + 3B + 15C + 35D + 63E\right),\\
a_3' &= \tfrac{3}{8}\left(A - B + 5C + 35D + 105E\right),\\
a_4' &= \tfrac{3}{16}\left(-5A + 3B - 5C + 35D + 315E\right),\\
a_5' &= \tfrac{3}{32}\left(35A - 15B + 15C - 35D + 315E\right).
\end{aligned}\tag{xxx}$$

If we now substitute (xxviii) in (xxix) and (xxx) we shall obtain the moments
of the first trapezette and the abruptness coefficients at $x = 1$ in terms of the first
five sub-frequencies. We have
$$\begin{aligned}
n_1\mu_1'' &= -.812,7818\,n_1 + .677,0691\,n_2 - .660,5497\,n_3 + .347,1889\,n_4 - .073,7827\,n_5,\\
n_1\mu_2'' &= \phantom{-}.706,7407\,n_1 - .824,1137\,n_2 + .830,5586\,n_3 - .441,5218\,n_4 + .094,3572\,n_5,\\
n_1\mu_3'' &= -.634,7502\,n_1 + .857,2689\,n_2 - .880,7902\,n_3 + .471,4747\,n_4 - .101,1083\,n_5,\\
n_1\mu_4'' &= \phantom{-}.581,4517\,n_1 - .854,1149\,n_2 + .888,8688\,n_3 - .478,0407\,n_4 + .102,7607\,n_5,
\end{aligned}\tag{xxxi}$$

* The abruptness coefficients in the previous case were determined from the five frequencies following
the initial ordinate; here they are found from the four frequencies following and the one preceding it.
and again
$$\begin{aligned}
a_1' &= -.067,9063\,n_1' - 1.651,2396\,n_2' + 1.177,1875\,n_3' - .554,8634\,n_4' + .111,8034\,n_5',\\
a_2' &= \phantom{-}.332,2458\,n_1' + .915,5792\,n_2' - 2.384,2525\,n_3' + 1.368,5243\,n_4' - .298,1424\,n_5',\\
a_3' &= -.988,7796\,n_1' + 2.823,7204\,n_2' - 2.126,0271\,n_3' + .472,0491\,n_4' - .027,9508\,n_5',\\
a_4' &= \phantom{-}2.497,6496\,n_1' - 9.939,8504\,n_2' + 13.394,6734\,n_3' - 7.822,9490\,n_4' + 1.677,0510\,n_5',\\
a_5' &= -7.413,4958\,n_1' + 25.320,8792\,n_2' - 33.899,3138\,n_3' + 20.768,5399\,n_4' - 4.856,4601\,n_5'.
\end{aligned}\tag{xxxii}$$
Here as before $n_s' = n_s/N$.
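
The passage from the constants $A \dots E$ to the working series can be retraced mechanically. The sketch below applies the weights of (xxix) and (xxx), as read here from the printed fractions, to the printed coefficients of (xxviii), and reproduces the first two rows of (xxxi) and of (xxxii). The first coefficient of $E$ is taken as $-.0034536703$, the reading that makes $A + B + C + D + E = -n_1'$; that reading is an assumption of this sketch.

```python
# Coefficients of n_1' ... n_5' in A, B, C, D, E, from (xxviii).
XXVIII = {
    "A": [-1.6496484755, 3.3503515245, -3.7207162874, 2.0527864045, -0.4472135955],
    "B": [0.9132876419, -5.5033790247, 7.1066919065, -4.1516383427, 0.9316949906],
    "C": [-0.3131772759, 2.6451560574, -4.3080606243, 2.7644801733, -0.6521864934],
    "D": [0.0529917797, -0.5303415536, 1.0017231390, -0.7303276686, 0.1863389981],
    "E": [-0.0034536703, 0.0382129964, -0.0796381338, 0.0646994335, -0.0186338998],
}


def combine(weights):
    """Row of five coefficients for weights[0]*A + ... + weights[4]*E."""
    return [sum(w * XXVIII[name][i] for w, name in zip(weights, "ABCDE"))
            for i in range(5)]


# (xxix): first and second moments of the first trapezette about x = 1.
mu1_row = combine([2 / d for d in (3, 5, 7, 9, 11)])
mu2_row = combine([-8 / d for d in (15, 35, 63, 99, 143)])

# (xxx): first two abruptness coefficients at x = 1.
a1_row = combine([c / 2 for c in (1, 3, 5, 7, 9)])
a2_row = combine([c / 4 for c in (-1, 3, 15, 35, 63)])

for row in (mu1_row, mu2_row, a1_row, a2_row):
    print([round(c, 7) for c in row])
```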

We have accordingly to add the values given by (xxxi) to the expressions for
the moments for the remainder of the frequency corrected for the abruptness by
means of the series (xxxii). We propose to illustrate our results on one or two
numerical examples.


(12) Illustration V. The following data provide the years of survival for
10,000 persons, male and female, born in England and Wales with congenital
malformations*.

   Age at death        Male      Female
   Years  0-1          8762       8753
          1-2           393        339
          2-3           140        150
          3-4            95         80
          4-5            86         69
          5-10          185        184
         10-15           90        132
         15-20           86         86
         20-25           63         52
         25-30           45         40
         30-35            9         40
         35-40           18         17
         40-45            9          6
         45-50            9         23
         50-55            5         11
         55-60            -          6
         60-65            5          -
         65-70            -          6
         70-75            -          6

   Totals            10,000     10,000

* Registrar-General's Annual Report, p. 207, 1913.

Now consider how we should endeavour to find the mean and standard deviation
of such series under the old method. We clearly cannot use Sheppard's corrections.
If we concentrate the deaths in the first year of life at 0.5, we shall certainly get
too high a mean. Now Pearl has shown by taking Prussian statistics (Biometrika,
Vol. IV, p. 515) that as deduced from data registered at short intervals of days, the
mean of the total population of infants dying in the first year of life should be concentrated
at 0.3 instead of 0.5 year of life. But our infants with congenital malformations
undoubtedly die earlier than the great bulk of normal infants. We
might therefore hazard a concentration at 0.2; but this would be mere guesswork*,
and what is more would not provide the proper corrections for concentrating in the
case of other years of life. We obtain, however, by this process the following
results:

                                        Male                  Female
   First Year concentrated at:      0.2       0.3         0.2       0.3
   Mean                           1.2436    1.3313      1.5077    1.5952
   Standard Deviation             4.5932    4.5734      5.7750    5.7532


The differences between the 0.2 and the 0.3 results are considerable and it will
be found from the sequel that the 0.2 results are closest to the corrected results
for both mean and standard deviation in the case of the male and the female.
Indeed a quite reasonable result might have been reached by centring the deaths
in the first year of life at 0.2. But such a priori guesses must be at best risky.
When we proceed to apply our method by cutting off the first year of life, we note
at once that in this case, as in many other of a like J-distribution character,
a grave difficulty arises, namely we have starting from the group 1-2 not got the
groupings in year or five year ranges, for we have cut off the first of our five year
groups. We cannot therefore straight away apply our formulae based on the
Euler-Maclaurin theory for equal subranges. The suggestion that at once occurs
is to take year groupings for our material. This of course would make no change
in the first raw moment $\nu_1''$, which would be the same whether we grouped into
year or five year subranges on the supposition that we simply split up our
frequencies into five equal groups for the five year periods. But there will be
a change for the second and higher moments. For the second moment the total
frequency of the five year group ($n_x$) centred at $x$ has to be multiplied by $x^2 + 2h^2$,
where $h = \tfrac{1}{5}$ of the subrange = one year, and similar corrections can be easily
obtained for the higher moments. Of course this distribution of each five year
frequency into five equal one year frequency groups is not satisfactory, but with
the irregular data as given it is, perhaps, as good a result as we can hope to get,
until official statisticians recognise the difficulty and table their statistics in a manner
to meet it, i.e. in this case, it would mean either proceeding by four year groups
after the 4-5, or giving the 5-6 frequency and then proceeding by five year
groups 6-11, 11-16, etc.
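
The redistribution just described amounts to adding the mean square of the offsets $-2, -1, 0, 1, 2$, that is $2h^2$ with $h$ one year, to $x^2$ for every five year group. A minimal sketch for the male deaths after the first year of life follows; the data are those of the table in this illustration and the function name is illustrative.

```python
# Male deaths after the first year of life: (lower age, upper age, deaths).
MALE = [(1, 2, 393), (2, 3, 140), (3, 4, 95), (4, 5, 86), (5, 10, 185),
        (10, 15, 90), (15, 20, 86), (20, 25, 63), (25, 30, 45), (30, 35, 9),
        (35, 40, 18), (40, 45, 9), (45, 50, 9), (50, 55, 5), (60, 65, 5)]


def raw_moment_sums(origin, groups):
    """Sums of n*x and n*(x^2 + spread), x measured from `origin` to the group centre.

    A group of `width` years is split into `width` equal one year groups; the
    spread is the mean square of their offsets (0 for one year, 2 for five years).
    """
    first = second = 0.0
    for lo, hi, n in groups:
        x = (lo + hi) / 2 - origin
        width = hi - lo
        offsets = [i - (width - 1) / 2 for i in range(width)]
        spread = sum(o * o for o in offsets) / width
        first += n * x
        second += n * (x * x + spread)
    return first, second


total = sum(n for _, _, n in MALE)
first, second = raw_moment_sums(1, MALE)
print(total, first, second)
```

The first sum is unaffected by the redistribution, and the second comes out within half a unit of the figure used in the text for the male remainder.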


* Actually our auxiliary curve gives 0.210 for males and 0.205 for females for means of deaths in the
first year of life.
Assuming the legitimacy for the present purposes of this redistribution in year
groups we find for moments round the end of the first year of life:

              Males                               Females
   $1238\,\nu_1''$ = 9446,               $1247\,\nu_1''$ = 12,079.5,
   $1238\,\nu_2''$ = 207,007.25,         $1247\,\nu_2''$ = 331,545.75.

Again we have for males:

   $n_1$ = 8762,  and by (xxxii)   $a_1' \times 1238$ =   -1122.222,8440,
   $n_2$ =  393,                   $a_2' \times 1238$ =    3041.534,5378,
   $n_3$ =  140,                   $a_3' \times 1238$ =   -9518.206,0741,
   $n_4$ =   95,                   $a_4' \times 1238$ =   19254.345,0950,
   $n_5$ =   86,                   $a_5' \times 1238$ =  -58196.492,8841.

Our abruptness functions are thus found to be
$$1238 \times \tfrac{1}{12}\left(a_1' - \tfrac{1}{60}a_3' + \tfrac{1}{2520}a_5'\right) = -83.056,6602,$$
$$1238 \times \tfrac{1}{120}\left(a_2' - \tfrac{5}{126}a_4'\right) = 18.978,9435.$$

These provide for the moments about 1:
$$1238\,\mu_1''' = 9446 - 83.056,6602 = 9362.943,3398,$$
$$1238\,\mu_2''' = 207,007.25 - 103.166,6667 - 18.978,9435 = 206,885.104,3898.$$

We now find from (xxxi) the values of
$$8762\,\mu_1'' = -6921.345,3000 \quad \text{and} \quad 8762\,\mu_2'' = 5951.033,6815.$$
Thus:
$$10,000\,\mu_1' = 8762\,\mu_1'' + 1238\,\mu_1''' = 2441.598,0398,$$
$$10,000\,\mu_2' = 8762\,\mu_2'' + 1238\,\mu_2''' = 212,836.138,0713,$$
or $\mu_1' = .244,160$, $\mu_2' = 21.283,614$.

Thus finally the Mean = 1.2442 years and the Standard Deviation = 4.6069 years.


We now turn to the female deaths and find with the same notation:
$$1247\,\nu_1'' = 12,079.5, \qquad 1247\,\nu_2'' = 331,545.75.$$

   Here $n_1$ = 8753,  leading by (xxxii) to   $1247\,a_1'$ =   -1014.250,5807,
        $n_2$ =  339,                          $1247\,a_2'$ =    2949.801,0796,
        $n_3$ =  150,                          $1247\,a_3'$ =   -7980.615,3654,
        $n_4$ =   80,                          $1247\,a_4'$ =   19991.399,2722,
        $n_5$ =   69,                          $1247\,a_5'$ =  -60065.060,3135.

Hence we deduce:
$$1247 \times \tfrac{1}{12}\left(a_1' - \tfrac{1}{60}a_3' + \tfrac{1}{2520}a_5'\right) = -75.422,972,$$
$$1247 \times \tfrac{1}{120}\left(a_2' - \tfrac{5}{126}a_4'\right) = 17.970,765,$$
and
$$1247\,\mu_1''' = 12004.077,028,$$
$$1247\,\mu_2''' = 331,545.75 - 103.916,667 - 17.970,765 = 331,423.862,568.$$
Again by (xxxi):
$$8753\,\mu_1'' = -6961.151,020, \qquad 8753\,\mu_2'' = 6002.499,428.$$
Thus:
$$\mu_1' = (8753\,\mu_1'' + 1247\,\mu_1''')/10,000 = .504,293,$$
$$\mu_2' = (8753\,\mu_2'' + 1247\,\mu_2''')/10,000 = 33.742,636.$$
Accordingly we have for females:
   Mean = 1.5043 years,
   Standard Deviation = 5.7869 years.

These are both in fairly good accord with the result that would have been
obtained by the a priori guess of 0.2 for centring the first sub-frequency.
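
The whole chain for the female deaths can be retraced in a few lines. The sketch below takes the raw sums for the remainder as quoted, applies the series (xxxii) and the first two rows of (xxxi), and combines the pieces with the signs read off from the arithmetic displayed above. It is a check on that arithmetic under those assumptions, not an independent derivation; variable names are illustrative.

```python
import math

N_TOTAL = 10_000
n = [8753, 339, 150, 80, 69]        # first five yearly frequencies, females
remainder = N_TOTAL - n[0]          # 1247 deaths after the first year
nu1, nu2 = 12079.5, 331545.75       # raw sums about age 1, as quoted in the text

# Rows of (xxxii) and of (xxxi).
A1 = [-0.0679063, -1.6512396, 1.1771875, -0.5548634, 0.1118034]
A2 = [0.3322458, 0.9155792, -2.3842525, 1.3685243, -0.2981424]
A3 = [-0.9887796, 2.8237204, -2.1260271, 0.4720491, -0.0279508]
A4 = [2.4976496, -9.9398504, 13.3946734, -7.8229490, 1.6770510]
A5 = [-7.4134958, 25.3208792, -33.8993138, 20.7685399, -4.8564601]
M1 = [-0.8127818, 0.6770691, -0.6605497, 0.3471889, -0.0737827]
M2 = [0.7067407, -0.8241137, 0.8305586, -0.4415218, 0.0943572]


def dot(row):
    """Apply one row of coefficients to the five frequencies."""
    return sum(c * f for c, f in zip(row, n))


a1, a2, a3, a4, a5 = (dot(r) for r in (A1, A2, A3, A4, A5))
f1 = (a1 - a3 / 60 + a5 / 2520) / 12     # abruptness function, first moment
f2 = (a2 - 5 * a4 / 126) / 120           # abruptness function, second moment

rest1 = nu1 + f1                         # corrected first-moment sum of the remainder
rest2 = nu2 - remainder / 12 - f2        # Sheppard's term, then abruptness
mu1 = (dot(M1) + rest1) / N_TOTAL        # about the end of the first year
mu2 = (dot(M2) + rest2) / N_TOTAL
mean = 1 + mu1
sd = math.sqrt(mu2 - mu1 ** 2)
print(round(f1, 4), round(f2, 4), round(mean, 4), round(sd, 4))
```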


(13) Illustration VI. It is not without interest to inquire if this centring 
of 0-2 maintains itself when we turn to other material for congenital malformations. 
We can use the material provided in the United States Census for 1899—1900, 
Vol. 1v, p.670. From the data there given we deduce that for 10,000 congenitally 
malformed individuals of either sex born: 


Died in year of life    Males     Females
0–1                      9626      9543
1–2                       129       204
2–3                        61        57
3–4                        27        49
4–5                        14        25
5–10                       54        41
10–15                      34        41
15–20                      20         8
20–25                      14         8
25–30                       -         8
30–35                       7         -
35–40                       -         -
40–45                       -         -
45–50                       -         -
50–55                       -         8
55–60                      14         -
60–65                       -         -
65–70                       -         8

Totals                 10,000    10,000

We have as before: Males, $374\,\nu_1''' = 2657$; Females, $457\,\nu_1''' = 2595.5$;
Males, $374\,\nu_2''' = 71127.5$; Females, $457\,\nu_2''' = 76280.25$.
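The four sums just quoted can be recovered from the census table by a short sketch. Two points are assumptions inferred from the arithmetic rather than stated in the text: the origin is taken at the end of the first year of life, and each five-year group contributes an extra $(h^2 - 1)/12 = 2$ per individual to the second moment, so that the uniform Sheppard term $1/12$ can afterwards be subtracted for every group. The male entry 7 for ages 30–35 is likewise inferred from the totals.

```python
# (start_age, width_in_years, males, females) for deaths after the first year,
# per 10,000 born; the 30-35 male entry (7) is inferred from the column total.
GROUPS = [
    (1, 1, 129, 204), (2, 1, 61, 57), (3, 1, 27, 49), (4, 1, 14, 25),
    (5, 5, 54, 41), (10, 5, 34, 41), (15, 5, 20, 8), (20, 5, 14, 8),
    (25, 5, 0, 8), (30, 5, 7, 0), (50, 5, 0, 8), (55, 5, 14, 0),
    (65, 5, 0, 8),
]

def raw_moment_sums(column, origin=1.0):
    """Return (n, sum n*x, sum n*(x^2 + (h^2 - 1)/12)) about `origin`.

    column 0 = males, column 1 = females. The (h^2 - 1)/12 term for groups
    of width h is an assumption that reproduces the printed second moments.
    """
    n = s1 = s2 = 0.0
    for start, width, males, females in GROUPS:
        count = (males, females)[column]
        x = start + width / 2 - origin          # group midpoint from the origin
        n += count
        s1 += count * x
        s2 += count * (x * x + (width ** 2 - 1) / 12)
    return n, s1, s2

male_sums = raw_moment_sums(0)
female_sums = raw_moment_sums(1)
print(male_sums, female_sums)
```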
We now turn to the abruptness coefficients at the end of the first year of life
and find:

Males

$n_1 = 9626$, and by (xxxii) $374\,a_1' = -808.283{,}5789$,
$n_2 = 129$, $374\,a_2' = 3203.644{,}5476$,
$n_3 = 61$, $374\,a_3' = -9271.066{,}1366$,
$n_4 = 27$, $374\,a_4' = 23389.468{,}5164$,
$n_5 = 14$, $374\,a_5' = -69671.015{,}1599$.

$$374 \times \tfrac{1}{12}\left(a_1' - \tfrac{1}{60}a_3' + \tfrac{1}{2520}a_5'\right) = -56.784{,}418,$$
$$374 \times \tfrac{1}{120}\left(a_2' - \tfrac{5}{126}a_4'\right) = 18.962{,}592.$$
Thus: $374\,\mu_1''' = 2600.215{,}572$, $374\,\mu_2''' = 71077.370{,}741$.
From (xxxi) we have: $9626\,\mu_1'' = -7768.448{,}082$, or, $1 + \mu_1'' = .1930$, and $9626\,\mu_2'' = 6736.839{,}298$.
Thus: $10{,}000\,\mu_1' = -5168.232{,}510$, and $\mu_1' = -.516{,}823$;
$10{,}000\,\mu_2' = 77814.210{,}039$, $\mu_2' = 7.781{,}421$;
or, finally,
Mean = .4832 year,
Standard Deviation = 2.7412 years.

Females

$n_1 = 9543$, and by (xxxii) $457\,a_1' = -942.176{,}2334$,
$n_2 = 204$, $457\,a_2' = 3281.101{,}5644$,
$n_3 = 57$, $457\,a_3' = -8958.636{,}6700$,
$n_4 = 49$, $457\,a_4' = 22229.438{,}8090$,
$n_5 = 25$, $457\,a_5' = -66617.544{,}9966$.

$$457 \times \tfrac{1}{12}\left(a_1' - \tfrac{1}{60}a_3' + \tfrac{1}{2520}a_5'\right) = -68.275{,}096,$$
$$457 \times \tfrac{1}{120}\left(a_2' - \tfrac{5}{126}a_4'\right) = 19.991{,}508.$$
Thus: $457\,\mu_1''' = 2527.224{,}904$, $457\,\mu_2''' = 76222.175{,}159$.
From (xxxi) we have: $9543\,\mu_1'' = -7640.738{,}2653$, or, $1 + \mu_1'' = .1993$, and $9543\,\mu_2'' = 6604.373{,}5073$.
Thus: $10{,}000\,\mu_1' = -5113.513{,}361$, and $\mu_1' = -.511{,}3513$;
$10{,}000\,\mu_2' = 82826.548{,}666$, $\mu_2' = 8.282{,}655$;
or, finally,
Mean = .4886 year,
Standard Deviation = 2.8322 years.
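The chain of corrections for the American data can be followed numerically. The sketch below is illustrative only: the numerical factors $1/12$, $1/60$, $1/2520$, $1/120$ and $5/126$ are assumptions, chosen because they reproduce the printed corrections from the printed abruptness coefficients; the first-group sums from (xxxi) are taken as given, since that equation lies outside this section. All function and variable names are hypothetical.

```python
import math

def corrected_rest_moments(n_rest, nu1, nu2, a):
    """Corrected moment sums for the deaths after the first year.

    `a` holds the five printed products n_rest * a_k'. The numerical
    coefficients are an assumption inferred from the printed results.
    """
    a1, a2, a3, a4, a5 = a
    first = nu1 + (a1 - a3 / 60 + a5 / 2520) / 12
    second = nu2 - n_rest / 12 - (a2 - 5 * a4 / 126) / 120
    return first, second

def summarise(data, total=10_000):
    """Combine first-year group and remainder; origin assumed at one year."""
    rest1, rest2 = corrected_rest_moments(
        data["n_rest"], data["nu1"], data["nu2"], data["a"])
    mu1 = (data["first_group"][0] + rest1) / total
    mu2 = (data["first_group"][1] + rest2) / total
    return 1 + mu1, math.sqrt(mu2 - mu1 ** 2)

MALES = dict(
    n_rest=374, nu1=2657.0, nu2=71127.5,
    a=(-808.2835789, 3203.6445476, -9271.0661366, 23389.4685164, -69671.0151599),
    first_group=(-7768.448082, 6736.839298),   # 9626 mu_1'', 9626 mu_2''
)
FEMALES = dict(
    n_rest=457, nu1=2595.5, nu2=76280.25,
    a=(-942.1762334, 3281.1015644, -8958.6366700, 22229.4388090, -66617.5449966),
    first_group=(-7640.7382653, 6604.3735073),  # 9543 mu_1'', 9543 mu_2''
)

male_mean, male_sd = summarise(MALES)
female_mean, female_sd = summarise(FEMALES)
print(f"males:   mean {male_mean:.4f}, s.d. {male_sd:.4f}")
print(f"females: mean {female_mean:.4f}, s.d. {female_sd:.4f}")
```

Small differences in the later decimals of the intermediate corrections do not affect the final values to the four places printed.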


It is clear that in both cases the centring of those who die in the first year of
life is a little under 0.2, instead of slightly over 0.2 as in the English data. It is
worth while inquiring what the effect of concentrating the deaths in the first year
of life at 0.2 and then simply determining the crude moments will be. We find:


                      Concentration at 0.2    Concentration at actual centres*    Complete corrections
                      Male      Female        Male      Female                    Male      Female
Mean                  .496      .496          .489      .495                      .483      .489
Standard Deviation    2.729     2.821         2.730     2.822                     2.741     2.832
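The first two pairs of columns of this table can be reproduced by a brief sketch, on the assumption that "crude moments" means ordinary moments with every later group placed at its midpoint and the first-year deaths placed at the chosen centre. The table data are repeated so that the block is self-contained; the male entry 7 at ages 30–35 is inferred from the column total, and all names are illustrative.

```python
import math

# (midpoint in years, males, females) for deaths after the first year.
AFTER_FIRST_YEAR = [
    (1.5, 129, 204), (2.5, 61, 57), (3.5, 27, 49), (4.5, 14, 25),
    (7.5, 54, 41), (12.5, 34, 41), (17.5, 20, 8), (22.5, 14, 8),
    (27.5, 0, 8), (32.5, 7, 0), (52.5, 0, 8), (57.5, 14, 0), (67.5, 0, 8),
]
FIRST_YEAR = (9626, 9543)  # males, females

def crude_mean_sd(column, centre=0.2):
    """Crude mean and s.d. with the first-year deaths concentrated at `centre`."""
    points = [(centre, FIRST_YEAR[column])]
    points += [(row[0], row[1 + column]) for row in AFTER_FIRST_YEAR]
    total = sum(n for _, n in points)
    mean = sum(n * x for x, n in points) / total
    second = sum(n * x * x for x, n in points) / total
    return mean, math.sqrt(second - mean ** 2)

at_02 = [crude_mean_sd(0), crude_mean_sd(1)]
at_actual = [crude_mean_sd(0, 0.1930), crude_mean_sd(1, 0.1993)]
print(at_02, at_actual)
```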


This process then gives quite a reasonable value for the mean and standard
deviation. Thus all we have to do for a rough practical value is to use the $\mu_1''$ of
the first equation of (xxxi) to obtain the centring of the first group and then find
the raw moments only. For a very high first group this is considerably better
than applying our first non-asymptotic method and of course better than mere raw
moments. The following are values found from year groups:

* That is at .1930 and .1993.

                      1st Method of this paper    Raw moments
                      Male          Female        Male      Female
Mean                  .609          .610          .592      .592
Standard Deviation    [illegible]   2.82          2.72      2.81

The means are inadequate, but it is remarkable how close the standard deviations
are to the corrected values.

(14) The reader may occasionally be puzzled to settle whether a frequency
distribution has really a finite or infinite initial ordinate and therefore be in doubt
as to whether he should apply the first or second method of this paper. Our
Illustration III may be taken as a possible example of this, although the
exaggeration of the first frequency is nothing like so marked as in the case of
congenital malformations.

If we apply the first equation of (xxx1i) to the first three days’ period we find: 

0–3 days      $n_1 = 18.25$
3–6 days      $n_2 = 6.58$
6–9 days      $n_3 = 7.89$
9–12 days     $n_4 = 5.65$
12–15 days    $n_5 = 5.82$

whence $18.25\,\mu_1'' = -14.057{,}688$, so that $\mu_1'' \approx -.7703$ and, remembering our three days' unit,

Mean = .69 day.
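Our table now becomes:

0–3 days            18.25    centred at    .69 days
3–6 days             6.58                  4.5 days
6–9 days             7.89                  7.5 days
9–12 days            5.65                 10.5 days
12–15 days           5.82                 13.5 days
15 days–1 month     19.80                  .75 months
1–2 months          22.59                  1.5 months
2–3 months          18.58                  2.5 months
3–4 months          15.96                  3.5 months
4–5 months          13.30                  4.5 months
5–6 months          11.51                  5.5 months
6–7 months          10.61                  6.5 months
7–8 months           9.30                  7.5 months
8–9 months           8.74                  8.5 months
9–10 months          8.29                  9.5 months
10–11 months         7.51                 10.5 months
11–12 months         6.94                 11.5 months

Total              197.32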

Hence by raw moments we find:

Mean = 112.98 days as against 114.61 days,
Standard Deviation = 105.53 days as against 107.15 days,
found by the first method of this paper.

Here we have not used our full second method but the results are in fairly close 
accord, especially in view of the fact that we have not corrected for the curtailment 
abruptly at the end of the 12 months. Accordingly the suggestion made is that 
in doubtful cases both methods will give fairly closely the same values, and therefore 
we need not worry over which is the more correct one to apply. 
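The raw-moment mean of this illustration can be approximately recovered from the re-centred table. The sketch below assumes a month of $365.25/12$ days, which the text does not state; with that assumption the mean comes out within a few hundredths of a day of the printed 112.98. The standard deviation is computed in the same way but is not claimed to match the printed value to the last figure.

```python
import math

DAYS_PER_MONTH = 365.25 / 12  # assumption: month length is not given in the text

# (frequency, centre, unit): "d" for days, "m" for months.
ROWS = [
    (18.25, 0.69, "d"), (6.58, 4.5, "d"), (7.89, 7.5, "d"), (5.65, 10.5, "d"),
    (5.82, 13.5, "d"), (19.80, 0.75, "m"), (22.59, 1.5, "m"), (18.58, 2.5, "m"),
    (15.96, 3.5, "m"), (13.30, 4.5, "m"), (11.51, 5.5, "m"), (10.61, 6.5, "m"),
    (9.30, 7.5, "m"), (8.74, 8.5, "m"), (8.29, 9.5, "m"), (7.51, 10.5, "m"),
    (6.94, 11.5, "m"),
]

def raw_mean_sd(rows, days_per_month=DAYS_PER_MONTH):
    """Total frequency, mean and s.d. (in days) from group centres alone."""
    in_days = [(f, c * (days_per_month if unit == "m" else 1.0))
               for f, c, unit in rows]
    total = sum(f for f, _ in in_days)
    mean = sum(f * x for f, x in in_days) / total
    var = sum(f * (x - mean) ** 2 for f, x in in_days) / total
    return total, mean, math.sqrt(var)

total, mean_days, sd_days = raw_mean_sd(ROWS)
print(f"total {total:.2f}, mean {mean_days:.2f} days, s.d. {sd_days:.2f} days")
```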




PECCAVIMUS! 


This paper is devoted to a number of slips recently made by the Biometric
School and which it is desirable to correct at once, before the formulae which need
correction pass into general use. Some of these slips are due to war haste, others
to neglect of terms which ought to have been included in our approximations and
some to printers' errors. We have to thank Professor Tchouproff of Petrograd for
indicating the existence of several of these mistakes.

(I) Biometrika, Vol. XI, p. 215. On the Probable Error of a Coefficient of
Contingency without Approximation. By Andrew W. Young and Karl Pearson.

Down to p. 222, equation (xii), this has been again checked without discovery
of any error. But on that page the authors "take $M$ to be very large compared
with $N$ and make $\chi_1 = \chi_2 = \chi_3 = \chi_4 = 1$" by an oversight. The values of the $\chi$'s are
given on p. 217, equation (vi), and clearly when $M$ is very large compared with $N$,
$\chi_1 = \chi_2 = \chi_4 = 1$ and $\chi_3 = 1 - 2/N$. Accordingly equations (xiii) and (xiv) of p. 222
for samples from "an infinite population" require modification and should be*


oa v [5 (ares) - {8 Gor 
: AS ‘ Ng 
+ ye [98 ()—#8 (i) 8) +198(s) 228 (ye) 
+m |8(a2)- GDF - G2) + 8G) 8) 
+ 88(phs) — 618 (Hx)p 


a 
—~ 
‘a 
ee 
nr 
=) 
— 


eo [fE)o-9] 
1 


+(e (Ce) ro—sons(Q) 088) 


s 


rs 


—(2—4¢") ¢ — 166? + 106! — 2] 
+m SGE)+8@)- {8 Gay -68(82) + 98) 
ae (f) (2c — 4? + 8) — 6h! + 12¢? + deg? — 0? — Qe + 2| 
ae (xiv). 


* The changes due to x; affect the term in 1/N*, but the original (xiii) has a wrong sign to the third 
term in 1/N2. 
We may now turn to the numerical illustrations. It will be sufficient to show
the correct values of $\sigma_{\phi^2}$ in the table on p. 224.

                    First and second terms of (xiv)    All terms of (xiv)
Old Values                    .02709                        .02729
Corrected Values              .02725                        .02744


For practical purposes, these would all be taken as .027, and accordingly the
errors, although sufficiently distressing, do not modify the conclusions, that for
a sample over 1000 the first and second terms of (xiv) are adequate. In the second
example, p. 227, more serious changes are made, chiefly owing to the error in the
sign of the second-order term (2 — 4°) c, which becomes of greater importance now 
that $N$ is reduced from 1801 in the first illustration to the 218 of the second
illustration. We have for $\sigma_{\phi^2}$:

                    First and second terms of (xiv)    All terms of (xiv)
Old Values                    .0798                         .0823
Corrected Values              .0693                         .0719


Thus for practical purposes the .069 of the first and second-order terms is only
raised to .072, if we include the third-order term. We may therefore conclude that
250 cases marks something like the limit at which we need to consider the third-order
term as well as the first- and the second-order terms.

We now turn to the test for zero-contingency. Equation (xvii) of the original 
paper is correct, but the wrong value of x, was inserted to obtain (xviii); 1t should 
of course be 1 —2/N. This leads to 


c(e —-2)—-2(e—1) 


oy = z ie 4p N +2(c— ih Benen (xvii), 
or perhaps as it is better expressed : 
sgh Al ee I ce ne 
On = 773 8 ia +2 (1 _ x) (c-—1)— mt Laat ueee (xvill) bis. 


The formulae summarised on p. 229 must be altered to accord with the results
given above. (C) must be (xiv) of the present paper. (D) must have $-2c$ and
not $+2c$ for its last term. (C′) must be (xviii) above.


(II) The object of our next note is to make some additions and corrections provided
by Dr Isserlis himself to his paper: "On the Conditions under which the 'Probable
Errors' of Frequency Distributions have a real significance" (R. S. Proc. Vol. 92, A,
pp. 23–41, 1915). In that paper he gave the values of the frequency constants
$B_1$ and $B_2$ (formulae (19) and (23), pp. 30 and 31) of the distribution of the
moment-coefficient of any order $u$ about a fixed origin for a sample of size $n$ drawn
from a population of size $N$. These formulae are exact and no alterations are
proposed here in them nor in any conclusions drawn from them. In the latter part of
the paper Dr Isserlis deals with the value of the $\beta$-constants for moment-coefficients
referred to the mean of the sample. These latter values were approximate and
intended to be correct to terms in $1/n^2$. We are indebted to Professor Tchouproff
for pointing out that there is an error in the approximation, for one of the neglected
terms rises. When the correction is made, however, the statement (p. 24) remains
true that "for coefficients of high order the sample has to be an inconveniently
large fraction of the population itself if $\beta_1$ and $\beta_2$ are to approach even approximately
their Gaussian values" (i.e. 0 and 3). The results in the paper cited are exact and
correct* until section 5 (p. 35) is reached. In that section, formulae (38), (39) and
(41) are approximations and for the purposes of the paper should be given correct


ee UL agli rs ; 
to terms in . for (88) and to terms in a for (41). The use of the incomplete value 
U Cm 


lee ; 
Ci 99 (ig Neg) 2 Uh gp Le 
i 
in equation (37) has introduced an error in the value of $M_3$ given by equation (39).
We proceed to amend this error. 


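We have
$$\mu_u = \frac{1}{n} S\{n_s (x_s - \bar{x})^u\} = \frac{1}{n} S(n_s X_s^u), \qquad dX_s = -d\bar{x};$$
$$d\mu_u = \frac{1}{n} S\{dn_s X_s^u - u\, n_s X_s^{u-1} d\bar{x}\} + \frac{1}{n} S\left\{-u\, dn_s X_s^{u-1} d\bar{x} + \frac{u(u-1)}{2}\, n_s X_s^{u-2} d\bar{x}^2\right\} + \dots$$
$$= A + B + \text{terms of third and higher orders in } dn_s,\ d\bar{x}.$$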


Now it is well known that the mean value of fifth and higher powers of $dn_s$,
$d\bar{x}$, ... contains no terms of lower degree than the third in $1/n$.

In the formulae (38), (39) and (41) the values of $M_2$, $M_3$, $M_4$ were obtained as
the mean values of $A^2$, $A^3$ and $A^4$ respectively. The inclusion of the neglected
terms does not affect $M_2$, which is given correct to $1/n$, nor $M_4$, for the only term of the
fourth order in $dn_s$, $d\bar{x}$ in $(A + B + \dots)^4$ is $A^4$.

But $(d\mu_u)^3 = A^3 + 3A^2B +$ fifth-order terms in $dn_s$, $d\bar{x}$.


* There are some obvious printers’ errors overlooked in proof, of which the omission of the factor 
(M’u)* = 2p’ 2, (uy)? + (vou)? In the first line of equation (21) is most likely to mislead. It may also be 
noted that the factor pv? is missing in the first term of (26) and the factor 3 in the first term of (41). 
Hence the correction to be applied to the value of $M_3$ in formula (39) is the
mean value of $3A^2B$.

Now
$$A = \frac{1}{n} S(dn_s X_s^u) - u\mu_{u-1}\, d\bar{x}$$
and
$$B = \frac{1}{n} S(-u X_s^{u-1} dn_s\, d\bar{x}) + \frac{u(u-1)}{2}\,\mu_{u-2}\, d\bar{x}^2,$$
so that
$$A^2 = \frac{1}{n^2} S(dn_s^2 X_s^{2u} + 2\, dn_s\, dn_t X_s^u X_t^u) - \frac{2u\mu_{u-1}}{n} S(dn_s\, d\bar{x}\, X_s^u) + u^2 \mu_{u-1}^2\, d\bar{x}^2.$$

Let us write $A^2 = L + M + N$ and $B = H + K$.
pee ne tie wadnsdx) +S (Xe X "I dnedn,dz) 
nr WS (X POX dngdn,dx) + 28S (dnsdnzdnydt X UX 1X p" i 


Denoting the mean value of $HL$ by $\overline{HL}$ we have (see (31), (32), (33) of paper cited)


HL = - = bie su X X |, fe = ") ae 6 nx! 
Ns Z ats s\n 
eee = Ge a4. 9 Do Ne 2 aN mic -2) + ey x,-(2"9+8)x,|t 
= 28 fag Ue yt Net Xp) WON co | 
ae ux 39 \ ("= fe we) Xx? BUY S {(" gM a) (X au Yu at aX 2u-1 VY “ny! 
ia ae i\ a nw 2 oe c : 
— oS { a) 7 gut1 VY u— 2X ou V U 
28 1( ie (X;5 Xt + X 5 xX; ) 


298 ee OG ER CeD OY XOX," | 


=e uXp S 2s x a 
n* { : 


Ww 


8 (25M (X ou xX ut Ox aul Y wt Xe sut1 Y vo ie OX eux i 
\ ne s t s § 


= UX [- DIY (ng X Xx ‘| - S {resis Kea XG 4b OX xy} 
{ n2 1 (fs 


he 


Neat ae “ oe es 
= Ss. ea (2N eter pare AX 2" X | 
9 Ns “~ IG. Gu Xp wy Nutr Xe You 
— 28 3- (3 at 4 als alt Ap ) 


+35(" s ee 2) ms ee LEN Es Yt OX’ 2u— xen} | 
n


5 uy in Ny? — 2N fu bu busi | 


a + feu eu + 22 fouaMuts 


&: UXb E a S {ss gilt ( Xs 2u Xx ey) X MAX 1 = 2X gu X _ xenxey| 


ns 


uy db’ ; 
a liteaey ar 2 oui fluti — Pou hu — Powter | 3 


5, eter lain ap lat 
or mean value of HL* = = | ie 4 fg ee Cees ) 
Bou bu — seu buy 


ee e (Hau oF 2 fou—1 Mu — Pou bhu — Honsstn-o)f 5 
= PTR IVE eo Y({ V 2u-1 22 Y(Vuy u-1 , 72) 
HM = le [s tNeg dng dx } = Xs a\ ¢ dn,dn, dx Hs 
12 


so that, using (34), (35) of the original paper 


HM= Seo [spen x [m(o (1-7) +f) exeQ0% +9) |; 


+8 1K, Payer [* (2N..Xy— ps) — (Xs — Xi) “lh | 


PAN [Viegas <i Wily We) = ie oe 
———<— Bis} (* _ = | eS Ny pea 
n? XP | Ms | wn As ne 


+ 28 pean jto8 {2 Me Xu XY 4 
5 7" 2 

2? pur b ee eae 1 Sts y Ss 

BRE aS [ms eacent + 8 fae 


7 ef Me (Xe Xe ee Xa, | 


= arr vaaie ¢ x|¢ (Ms Mou — Pe huPu-1 sr 2 fu uti) 


ote e (Me Moura F Mowt — Mua bute + bubuti | . 


Again HN =~“ §(X,"-1dn,dz", 
so that using (36) 


ava _% eet E {xe (3x6 © fly X 5 + x" 5 (X+ 3p X 5 — fs) | 


UP wey, ; 
aa co x Spin + 


/ 


i e (Mute + Blo flu ma 7 ) 


Ky = UMD 2g (x dn pda} + 25 (X,Y X,"dn,dn,da]. 


Qn? 


* It must be remembered that @ and e are of same order in = Cf. equation (5), p. 26, lc. 
Therefore, using (34) and (35),


= ud 1) pu» ui (eat ( Mey ¢’ b eioe, ies BG. 
en Qn? [s ios n X | U2 ae es ae apes (267 + n ) 


+28 |X Xety “= wt | (2X. Xi— m)- (X, -xy®]} | 


_t (u— | dies \( _ **) ». Fe just XX. a 


Qn? n® 
bs (2M xorg MM xn xe) 
tee 2 | as | ree ON ‘| +8 ia xX 4 


_ 99 fee s Mt ( XG ut2 XG ye DX tt Xe a xoxo] | 


u(w — 1) Hu» : ; 
= In? J xX cE (Ms (fu fic) + pay vier 


Om 2 ) 
Sheets { ts Mou + Mowpe + Qu ad 2hubutes > 
? ‘ 


= Qu ; 
fo OS ie (- tie) S {dnd X,"}. 
n 


a 


Therefore, by (36), 


nN 


Ray = sf: "(3x6 SmoX typ (XE + Bu X, ~:))} 


ie eS 


We 


Oy 


x oy. » uta qa a 7 (us ar Soe Muy — bp} > 


(WD) ube aps 
a) en JA > 


= 


iN 


and using (26) of the original paper therefore 


Oh —1 Pb U—-2 | « = Y : = : 
Mean ee ieee | Sxbe TX. e (Ma + 34) 


2n? 


Adding these various terms to the mean value of $M_3$ as given by (39) of paper cited,
we find for the corrected value:


M, = XX. [ 2p? — 3puflou + Psu — SU fu (Pow a 2 pout plu) 
no 


+3? wWua (Huts om Hs [Lu) —w (ua Hs) ] 
3x


n> 


+ 


{@ (ot ae Zhu pu but — Pou hu — 2 fu+1 four) u 


ar Qu? ur (Me fou—1 — fy hua bu + 2Mubuss) 


— = u(u—l , é 
Wea (3 py flu) = = bu» (ls flow = Pole + 2 rut) 


5 , w(u-— 1) , ee 
— ww 1) puepua (Bho bus) + —_ Mu Ku-as (3y2)| 


a e E u (Msu ae Diba Mou — Khu Mou — hu-1 four) 


2? pu (Me Meu + Pewter — Hui Maye + Pupusa) 

= Wu (Muse + Sf 2bu — sur) 

+ a Hu—s (He Mou + bouts + 2p up — 2bu Puy) 
— UW (U— 1) fur PMu—s (Muts + So eMuti — Ms Mu) 

Re Soot) }. 


On ru Pu—2 (uy = SLs") 
and (40), (46) and (47) must be modified accordingly. It remains true that for
a normal population $M_3$ vanishes when $u$ is odd, and that in all cases $B_1 \propto \frac{1}{n}$.

If we write this value of $M_3$ in the form


 ’ ByfK  3y¢'T 
gate eNO Ge NEE 


n* nv nv? 


d 


then in order to obtain $B_1$ correct to $\frac{1}{n}$, the third term may be omitted, $R$ has the
same value as in Equation (47) of the paper cited and $K$ is zero for normal
distributions when $u$ is odd. For even values of $u$ in normal distributions the value of
$K$ is
$$u\left(\mu_u^3 - \mu_{2u}\,\mu_u\right) + \frac{u(u-1)}{2}\,\mu_{u-2}\left(\mu_2\,\mu_{2u} - \mu_2\,\mu_u^2\right),$$
which easily reduces to $-\tfrac{1}{2}u \times P \times \mu_u$, where $P = \mu_{2u} - \mu_u^2$ as in Equation (52).


We may therefore add to Table I on p. 39 of the original paper the following
column:

u      K
2      $-2\mu_2^3$
3      0
4      $-576\,\mu_2^6$
5      0
6      $-457{,}650\,\mu_2^9$
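The entries of this column can be checked directly. The sketch below assumes the reading $K = u(\mu_u^3 - \mu_{2u}\mu_u) + \tfrac{1}{2}u(u-1)\,\mu_{u-2}(\mu_2\mu_{2u} - \mu_2\mu_u^2)$ of the expression for $K$, together with the normal moments $\mu_{2k} = (2k-1)!!\,\mu_2^k$, and works in units in which $\mu_2 = 1$. The function names are illustrative.

```python
def normal_moment(k):
    """Central moment of order k of a normal curve with mu_2 = 1."""
    if k % 2:
        return 0
    value = 1
    for odd in range(1, k, 2):   # (k - 1)!! = 1 * 3 * 5 * ... * (k - 1)
        value *= odd
    return value

def K(u):
    """K for a normal population, under the reading of the formula given above."""
    m = normal_moment
    first = u * (m(u) ** 3 - m(2 * u) * m(u))
    second = u * (u - 1) // 2 * m(u - 2) * (m(2) * m(2 * u) - m(2) * m(u) ** 2)
    return first + second

def K_reduced(u):
    """The reduced form -u/2 * P * mu_u with P = mu_2u - mu_u^2 (u even)."""
    m = normal_moment
    return -(u // 2) * (m(2 * u) - m(u) ** 2) * m(u)

column = {u: K(u) for u in range(2, 7)}
print(column)
```

The coefficient of $\mu_2^{3u/2}$ is what is printed; the powers $\mu_2^3$, $\mu_2^6$, $\mu_2^9$ follow from dimensions.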


Biometrika x11 18 
In Table I on p. 39, the third column is unaltered, the second column becomes
(the coefficients of ¢ being approximated) 


U By, 
2 SOTO): 
2 x : 

3 0) 
4 LL eA 

n x 
5 0) 
6 1099 (y’ — 0-040)? 

n ee 


The corrected form of Table III (p. 40 of paper cited) is now as follows:

Table III. Approximate values of $\beta_1$, $\beta_2$ for samples of 1000 out of
a population of 1,000,000.

u      $B_1$      $B_2$
2      0.001      3.012
3      0.000      3.090
4      0.081      3.204


Thus the effect of the correction is to change the values of $\beta_1$ for $u = 2$ and
$u = 4$ from the values 0.008 and 0.102 to 0.001 and 0.081 respectively, but it
remains true that the frequency of the fourth moment-coefficient differs appreciably
from the normal distribution.


(III) Dr Isserlis also wishes to make the following emendations in his paper
in the last number of Biometrika, Vol. XII, p. 134. On p. 138 near the foot $+\,ABC$
has been dropped from the bracket $(8FGH + 2AF^2 + 2BG^2 + 2CH^2)$. Also in l. 6
of the same page for "on Q" read "and Q."

(IV) The point indicated by Professor Tchouproff, namely: that fourth-order
mean products are of the same order finally in $1/N$ as third-order mean products and
cannot be neglected therefore in comparison with third-order mean products, is of
great importance in investigations into the probable errors of frequency constants
in the case of small samples. In expanding functions of the deviations from mean
values of subfrequencies such as $\delta n_s$ we cannot neglect products of the fourth
order in the $\delta n_s$'s compared with products of the third order. In obtaining results
true to products of an odd order in the "statistical differentials" we must proceed
to products of the next highest even order to reach correctness.
This principle, which is almost self-obvious, was, however, overlooked by Pearson
in his paper "On the Application of 'Goodness of Fit' Tables to test Regression
Curves and Theoretical Curves used to describe observational or experimental
Data," in Biometrika, Vol. XI, pp. 239–261.

One of the objects of that paper was to investigate the probable errors and
frequency distributions of errors in the mean and standard-deviation of an array.
If we have an array of a first variate corresponding to a small subrange of a second
variate in a sample of $N$, the law of distribution of the means and standard-deviations
of such arrays when many samples of $N$ are taken had not been
investigated at the time Pearson wrote. If there be $n_p$ individuals in such
a sample, then the problem differs from the ordinary problem of the distribution
of means and standard-deviations in a sample of size $n_p$, in the fact that $n_p$ in the
case of the array varies from sample to sample. Hence we cannot straight away
assume that if $\bar{n}_p$ be the mean number in the array then $\sigma_{n_p}/\sqrt{\bar{n}_p}$ and $\sigma_{n_p}/\sqrt{2\bar{n}_p}$
will be the standard-deviations of the distributions of means and of standard-deviations
of the arrays; still less do we know how far it is legitimate to suppose these
distributions approximate to the Gaussian or normal type. As the problem is an
exceedingly important one the writer asked Miss Eleanor Pairman to revise his
work of 1916 by introducing where needful the fourth-order products. This she
has done with certain additions and expansions.


(a) From the equation on p. 289 we have: 


Wg Cole ONgpkHe [ONy\* 
mean (6m,) = mean & (—)* g fPat ae (= oP) ep at (S*) i. 


Np Np Np Np 
where § is a summation for every value of a from 1 to 2. 
= n 
But Mean SNopdNy = Ngp (1 —. ) 
and the regression relation is accordingly 


Se Ope 
iy 


gp 


Substituting this we see that every term vanishes and accordingly mean $(\delta m_p) = 0$,
not merely to a high order of approximation, but absolutely. In other words the
mean of the means of any array, notwithstanding that the number in that array
will vary, is equal to the mean of that array in the sampled population.


(b) We have for the pth array: 
S (Hap + Sitgp) Xq 
Ny + On, 
S (SNqpNpXy) — S( (Tgp %q) )dny, 
Tip (Ty + Srp) 


My, + Emp, = 


Vin : 


but dry = S(SNgp) and S(Rgp%q) = Np Mp, 
1) 
and accordingly
SES eT sie : 

ce S {[Sngp (&q — Mp)! 3 S [Orqv%o} 

P ? 


Ny + Onp Ny + SNp 


where $x_q$ is measured from the sampled population array mean.

Now we desire to obtain the various moment-coefficients of $\delta m_p$, or mean $(\delta m_p)^s$,
which for convenience may be written $\{\delta m_p^s\}$.

There are two ways at first sight of doing this: : 

(1) We may expand (7, + 6z,)’ in terms of 6n,/n, and then take the mean 
values of products such as: 
(Stig) (Onan)? (Ongnl (lap) aesene ; 

This was the process adopted in the original memoir. It is very laborious and 
the algebra so lengthy as to lead easily to slips. Still on the present occasion we 
went to terms of a high order (Oy) and some of the results obtained will be so use- 
ful in other investigations on probable errors of frequency constants that it seems 
worth while placing them on record here. The fourth-order mean products in én, 
and én, may be added to those given on p. 245 of the original memoir. 


They are: 
My 


il : - Ny ai aN lhe 
= > (On, ON on) = Nap (1 _ 7) i! +93 (1 — x) Nop (1 - *) ; 


2 a or 

— 2 (On, 82g Oty'p) = (1 = w) Nap Rap (1 a #) ( a a) 5 
2\ Nop Ng'p Nos n 

DY (dnp SNgpdNqpONg'p) = — 8 (1 ae 7) “pMeete'e (1 = 3) 


: 2 2, = = n 3n, 
> (Ong ONgn ONg'p) = (1 _ A Renton (1 — =) ( — =) ; 


> (np? Sigy) = 7 ( -%) 1+3(1-z)a (1 -"%) 
ee Olpxe hap ap N WN) NV 


a 


ed a a 


For the fifth-order mean products, Miss Pairman also provided the following 
values* : 


1 -  & Nay 279, 6\ _ n 
5S (Bian)? = Rap @ as “a @ _ = {i +2 (5 a y) ea - “| 


ieee Noo Noto 2Noy 
1 Begg) = Baa (1 — 28) 
es P : aa NW Bes 2 \ Ngo! 
x & (On gp ON 779) = Ting Mpg! {1 ar WV — (1 — 7) WV 


= 6) Rng Rog _ Noa’ - 5") |} 
(5-5) -¥-#C wi) If 


* These results are of course perfectly general, that is to say we can suppress p and suppose them 
the mean variation values of elements ng, mq’, Nqv, Ng ANA Ngiy Of any frequency distribution. 
1 : Nop Na'p Na” 2 Oey ee
< > (82% yp dNq'p ONg"p) = — a : eae (5 -- wy) V (3 = Wt 


1 Nap No'p Na" 1 2 ON in » + Nap ll Map ; 


1 6 \ hi, p Ng’ Ny 
x & (S22 gp SNg'p ONg''p ONg"p) = (5 - 8) 4 hy bat) (1 - aie) 


1 6 \ Ray Nop Nop Rgrtp Ngiv 
=, & (Orgy ONg'p SN gp ONgpONgivp) = — 4 (5 a, P ae vere 


Alongside these we give the fifth order combinations of én, and 6n,,, which are 
deduced from these and were required for our purposes : 


1 ue Ny 2» 6 \ No» ie 
5 3 (Bp'8iqy) = Niigp (1 = re) (1 = FH) 1 +2 (5 - 7) eB (1- Wt | 
1 3898 = Ny § = Ze 
i > (6n, 7627 4p) = Nop (1 — 7) is —Ny+ (1 - a) Nop 
= he 

1(6-8)(- Ferre) 
1 ni 2 = 
x SON; OMep ONgin) = Non tan (1 — “*) |! iar (5 - 7) (1 - 2) (a — | : 

n, 2 1 
DCO, On on) = Nap ( — z) {i dt (1 = wy) ni, — 6 (1 —= a) tind 
3 


( _ "2 a (3- ae Nap 

NN] WT) J’ 
eo Np Nap 4ny\ 

= (5 7 Wr) E “NW (3- ¥)| 


1 P 6 \ Ngy Noy Ng» Ny\ [. An, 
5 3 (80,28 gy Bry Bry») = — ( pas 7) Taeyp tye (1 — 7) (3 - Ww) 


1 4 _ Qi» n SO Nay  ) 
1 2 (On, ON) = Ty ( — =) (1 — 7) i! +2 (5 - x Nop (1 — al ; 
ie a iy 2 ONG ep 
i YX (Snp On 8p dNg'p) = Nap Na'p (1 - z) \(1 — wy) — (5 ~ 7) V (3 - 7) 
J 
| 


1 n 
= 1 2 2 a Pp 
i= (ONpON 9p Ol a5) = Nop Neip @ = 2) 2 


6 (Gs + Tap — Faget 
N Ve. ye 
n 


; nie rere Auf. 06 Aion 
> (S12, 6274) ONgipONg'p) = — ere A = 7) (5 = x) (1 =- i) . 


Nop Nop Nay N / n 6 
Ss _ gp i : gp Dp 
> (ONy ONgy ONg'p ONg"p ONgi'p) (1 - 7) 4 (5 wy) 


ye ele 
Also we give here additional moments* of the binomial $(p + q)^n$ about its mean:

$$\mu_2 = npq, \qquad \mu_3 = npq\,(p - q),$$
$$\mu_4 = npq\,\{1 + 3(n - 2)pq\}, \qquad \mu_5 = npq\,(p - q)\,\{1 + 2(5n - 6)pq\},$$
$$\mu_6 = npq\,\{1 + 5(5n - 6)pq\,(1 - 4pq) + 15n(n - 2)p^2q^2\},$$
$$\mu_7 = npq\,(p - q)\,\{1 + 4pq\,(14n - 15) + p^2q^2(105n^2 - 462n + 360)\},$$
$$\mu_8 = npq\,\{1 + 7pq\,(17n - 18) + 14p^2q^2(35n^2 - 154n + 120) + 7p^3q^3(15n^3 - 340n^2 + 1044n - 720)\}.$$
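These expressions lend themselves to an exact check. The sketch below compares each formula, as read from the (damaged) print, with the central moment computed directly from the binomial terms in rational arithmetic. It assumes that $q$ is the chance of the event counted, which is the convention under which $\mu_3 = npq(p - q)$; the function names are illustrative.

```python
from fractions import Fraction
from math import comb

def central_moment(n, q, k):
    """Exact k-th moment about the mean of the count of an event of chance q."""
    p = 1 - q
    mean = n * q
    return sum(comb(n, r) * q**r * p**(n - r) * (r - mean)**k
               for r in range(n + 1))

def printed_moment(n, q, k):
    """The closed forms for mu_2 ... mu_8 as read from the text."""
    p = 1 - q
    s, d = p * q, p - q
    forms = {
        2: n*s,
        3: n*s*d,
        4: n*s*(1 + 3*(n - 2)*s),
        5: n*s*d*(1 + 2*(5*n - 6)*s),
        6: n*s*(1 + 5*(5*n - 6)*s*(1 - 4*s) + 15*n*(n - 2)*s**2),
        7: n*s*d*(1 + 4*s*(14*n - 15) + s**2*(105*n**2 - 462*n + 360)),
        8: n*s*(1 + 7*s*(17*n - 18) + 14*s**2*(35*n**2 - 154*n + 120)
                + 7*s**3*(15*n**3 - 340*n**2 + 1044*n - 720)),
    }
    return forms[k]

checks = [
    central_moment(n, q, k) == printed_moment(n, q, k)
    for n in range(1, 7)
    for q in (Fraction(1, 2), Fraction(1, 3), Fraction(3, 10))
    for k in range(2, 9)
]
print(all(checks))
```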


The values obtained in the above laborious manner for {6m,*} agreed as far as 
we proceeded with those obtained by the following or second method. 


(ii) This second method consisted in first summing for $\delta n_{qp}$ on the assumption
that $\delta n_p$ was constant, and then summing for $\delta n_p$. This involved some new results
which will be useful in other problems and are recorded here.

For constant 7%, + dn, : 


Ole \ Nae n°, 

= 9 y _ -_ m4 

Mean (6n,»)? = (1 + =| =a (Tip — Tgp) + =* 2 OF ee 
Pp 


Np / 


ON»\ Nop Nay . Raph 
‘P qp 7p Gp" IP 2 
Mean (679, 6nqp) = — (1 +—— ) ee + bn,’, 


Np Np Ry? 


dn,\ 7 _ 2n n ne : 
Mean (SNgp)? = (1 + =)! “qp (7 ity =P (eon “QP + 36) i ED, 8 2 bn,’ 
D 


Np / Ny Np Ny Np 


8n,\ NgyN n n Neg No! 
Mean (614, d%qp) = -(1 de =) peel (1 — dn, —2 a + 36n, a) 4b aa én,, 
Dp 


Np Np 1p 


Mean (Ongp ONq'pONq"p) = te {(n, + dn,) (2 — 36n,) + 8n,*} 
p 
Now the value of this method was at once obvious, for proceeding to the
summations in the moment-coefficients of $\delta m_p$, for constant $\bar{n}_p + \delta n_p$ we found that they
corresponded with values to be found for the distribution in a sample $n_p$ of constant
size. In other words we reached a conclusion, which should have been obvious at
first sight, namely that to find the value of

Mean $(\delta m_p)^s$

all that we have to do is to write down the known value for $n_p = \bar{n}_p + \delta n_p$ constant and
then sum for $\delta n_p$. We might have pulled down the scaffolding in this correctional
paper and simply started from this result, but as several of the means reached in
processes (i) and (ii) seemed likely to be of value, we have preferred to indicate
the steps which led us to the final method.


* A simple reduction formula for the moments of a binomial about its mean was sought in vain.
After a good deal of energy had been spent on the problem, we believe that, $\mu_s$ being the $s$th moment
about the mean,
$$\mu_s = \left[\frac{d^s}{dz^s}\left(q e^{pz} + p e^{-qz}\right)^n\right]_{z=0}$$
is, perhaps, the easiest expression for reaching these moment-coefficients by successive differentiation.
(iii) For a sample of constant size $n_p$ the following are the moment-coefficients
of the variation in the mean*:
2 pe _ o'n 
(Buy!) = 2H =, 


Np Np 


Now let $S\left(\frac{1}{n_p^{\,s}}\right)$ equal the sum of $\frac{1}{n_p^{\,s}}$ for all values of $n_p$ which may occur in the
samples of $N$. If therefore $f_{n_p}$ is the frequency with which $n_p$ occurs, the whole
problem reduces to finding the values of
$$S\left(f_{n_p}/n_p^{\,s}\right),$$
for various values of $s$.

Now the frequencies of the n, are simply the terms of the binomial. The term 
in which »,=0 must not we think be taken into consideration, for in this case 
there is no variability in yw,’ as there is no frequency in the array, Le. ,“. must be 
put zero. Thus in the notation of a binomial $(p + q)^n$ we require to find:
$$\frac{n\,p^{n-1}q}{1^{s}} + \frac{n(n-1)}{1 \cdot 2}\,\frac{p^{n-2}q^{2}}{2^{s}} + \dots \qquad \text{(F)}$$
and to divide the result by $(p + q)^n - p^n = 1 - p^n$. This finite series we have not
succeeded in summing. Before indicating how we may approximate to it by the
mean-powers of $\delta n_p$, we can look at the problem from two other standpoints.

(i) If $\bar{n}_p/N = q$ be not small the binomial approximates to a Gaussian of
standard-deviation squared $\sigma^2 = npq = \bar{n}_p(1 - \bar{n}_p/N)$. Hence

+a 

s( 1 ) 2p re! : 

b 4 Cad —— = SENS é 

Ra VQaraJd —0 (Np + 2) 


ieee 2 La? 
= ae ie (1-32 4 SUH e308 dia: 
Vario -x Np LED ny 
ee ee rae ee Sia ras) 9 ) 
al TT.2 aig 1.2.3.4 wa) 
Se oe aay aI es eae é 1 ' | 
a eee ieee 2 (eae 
Thus S a. = le Ae w)t i, a ; 
ee AK 1 1) é i ) 
¥{—) ==, 41 i 15 (—=— = Sees eee G 
Sie) Tip? +3(— w/t My Ov bs \ i Co 
iy cect. ie 
ery oa rs ae pee ee a 
: a) ieee x)* Gs x) + j 


* Here ,u, is the sth moment-coefficient of the 7, array about its mean in the sampled population. 




(ii) Another method is to assume a Pearson Type III curve: $y = y_0 x^a e^{-\gamma x}$,
which is known to give a better approximation than the Gaussian to the binomial.
We assume it to start at the beginning of the first subrange of the binomial and
to have the same mean and standard-deviation. These conditions involve


a+1_— ai ( se 
¥ p> WN yy 
; ffl L ee ee OP gael 
Accordingly S leak Yur *e BI geree CieaL) 
where A = total frequency = aa D(a+1). 
Hence 
‘ie es eee i 
8 Ge) a it, 1 (= -x) A 
Np VN. 
s(5)-—" ils fs Sl 
ne) a(a—1) Fig? a iy ; 2(] 2 a , 
fin, IN) ale SUNT NG 
(=)- aes ee 1 
ne) a(a—1)(a—2) ii? 1-(5- xt Wi eC - 5) 1-3(2 a , 
n N/} fp. eal Tine Ne 


which are exact. Or, approximating 


Gat +G-w) +m) * 
nD ee 


SG) “al 
Ga 


If, 
a es I~ 1 
es | eee 
, ey aaa G x) : 


Or 


V4 Tye eel: Wea 
Pan il : 
Both methods agree to the terms in $\left(\frac{1}{\bar{n}_p} - \frac{1}{N}\right)$, but the Gaussian appears to
exaggerate in the terms in $\left(\frac{1}{\bar{n}_p} - \frac{1}{N}\right)^2$.
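The finite series (F), which the authors did not succeed in summing, is easily evaluated numerically, and the two hypotheses can be set beside it. The sketch below is illustrative and rests on assumptions that are not printed legibly in this section: writing $\epsilon = 1/\bar{n}_p - 1/N$, the Type III curve with matched mean and variance is taken to give $\bar{n}_p^{-s}\prod_{j=1}^{s}(1 - j\epsilon)^{-1}$, and the Gaussian is represented by the first three terms of its expansion, $\bar{n}_p^{-s}\{1 + \tfrac{1}{2}s(s+1)\epsilon + \tfrac{1}{8}s(s+1)(s+2)(s+3)\epsilon^2\}$. All function names are hypothetical.

```python
def exact_inverse_power_mean(N, q, s):
    """Series (F) divided by 1 - p**N: mean of n**(-s) over non-empty arrays."""
    p = 1.0 - q
    term = p ** N                          # binomial term for n = 0 (excluded)
    total = 0.0
    for r in range(1, N + 1):
        term *= (N - r + 1) / r * q / p    # recurrence between binomial terms
        total += term / r ** s
    return total / (1.0 - p ** N)

def type_iii_value(N, q, s):
    """Type III (gamma) value with the same mean and variance (assumed form)."""
    nbar = N * q
    eps = 1.0 / nbar - 1.0 / N
    value = nbar ** -s
    for j in range(1, s + 1):
        value /= 1.0 - j * eps
    return value

def gaussian_series(N, q, s):
    """First three terms of the Gaussian expansion in eps (assumed form)."""
    nbar = N * q
    eps = 1.0 / nbar - 1.0 / N
    return nbar ** -s * (1 + s * (s + 1) / 2 * eps
                         + s * (s + 1) * (s + 2) * (s + 3) / 8 * eps ** 2)

# An array of mean size 25 in a sample of 1000, as in the example used later.
rows = [(s,
         exact_inverse_power_mean(1000, 0.025, s),
         type_iii_value(1000, 0.025, s),
         gaussian_series(1000, 0.025, s)) for s in (1, 2, 3)]
for s, exact, t3, gauss in rows:
    print(s, exact, t3, gauss)
```

For arrays of this size the three values differ only by a few per cent, which is consistent with the remark that the hypotheses agree in the first-order term and part company in the second.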


(iii) We will now proceed to approximate on the basis of the moment-coefficients
of the binomial. We have
1 ae! (1 3 Ory) , 88 + 1) S(Onp*) _ eae G2) On ). 


U 
Ss a = 
(ip + Sny)® — Ny’ ite de es 2.3 Tas 


: u cere 
Here S(én,)=0, and we will keep terms up to the order = Asis which involves 
p 


proceeding to the fourth moment-coefticient. We find 
g 1 a9 ie eee =H \e ee = 1 -
(Ry +Snp»)® ip" 12> Ag, © av 1) eS} n, NV. Gorn 


p 


toes — eo) tw) Ge) al 
_ s(st1)(s+2)(s+3)(s+4), NA at ae 
2345 Oe oy ( 7 


ee BNE ee ee) IN Se 
1.2.3.4.5.6 15(5 - y) ete. | 


as far as terms of cubic order in the curled brackets. 


Hence we find 


Qe iae-DOrhed | 
s(.- HGH) 3—} | 
S(,)= ae “Ga (8+ 5+3p) 7 a 


z ny 11 Ee: 0(—- 4) + | 
GAY ueB)ata(t Aye 
] 1 il 1 OS 
1 1 30 1 Ne 
le at) (7+3))+205 (2-4) +..}. | 


E : 1 i : 
It will be seen that these values agree to the first term in (= = 7) with those 


Tipe 
given by either hypothesis (i) or (11). For the terms in (= —- a) they appear to 
p 


be intermediate between (1) and (11). os additional terms which do not oceur in 


either (i) or (11) are those in powers of 7 in the second- and third-order terms, 


Using these results we deduce : 


pM, = mean (dm,/P 
wha (22) (ada dea(t 1) (048) e6(2- 2) 
a, f+ (- w)+y+m)+ 2a a) (+9 le iy 


Thus the probable error of the mean of an array of mean size 7, in a sample 


of N is: 


on a(=- ¥)( 11 5 (5--y) 22 (= —a) 
oragg Fe hh 45 (5. N Let al enle N (Oar) a = ae 


ip 
Again
_ pls = — 5) 4 >a) (4 Pt 40 (de er 
ol ae y)3+yt mt — x 11 +57) +50 ue 
poe hE ee (K), 
and if t= i, re 


M;* _ »B {1+6(84+ 3+ yy.) +e (11+ 5p) i soe 
pw ell ofl ON wae) 
P pe Ty i+ e(l4+ qty 5) + o(2+ x) tot 


which after reduction gives 


_pPifya fg. 5 wm) (=~ xt 56\/1 1) (|--+) 
i + (3+ ay + ye a, + (13+ 5) - wy) +69 A ie 


Pp 


Clearly if ${}_p\beta_1$ were as large as 0.5, ${}_pB_1$ for an array of 30 would be of the
order .02 and thus the array means have an approximately symmetrical distribution.


We now turn to the value of ,B,, and find with the same value of as before 


sy — Byte? 10 15 , 150 
pM, — 3,Me =* ae {1+ (6+ 57+ + yn) St (35 yr) 8+ 2256} 


i [+ 5+ w+ (645 v)¢ + Mer 


my }y -(24+ 54 qp)o-(14 q) ye 4g 
are WW N 


lien the previous result by this we have 


yBo— 3 { 8 1153 Il ae aie? ae ey 
pB,— 8 BS 14 (44 Ht w)(s, x) * (2+ 9) G,-y) 


Pp 


Further, (,M/.)~ 


1 1 2 3 5 1 1 il 1\? 
oe Ga nity +a (14 Saal ule =a! a Se 


For example, in an array of 25 in a sample of 1000, if ${}_p\beta_2$ were as high as 3.8, we
should have ${}_pB_2$ slightly less than 3.2. Accordingly the constant ${}_pB_2$ of an array is
not as approximately normal as ${}_pB_1$, or, we have the material thrown out further
towards the tails than in the normal distribution.

It is probably, however, adequate to speak of the means of an array of variable 
size as roughly following a Gaussian curve and give the usual meaning to the 
“probable error” of the mean of such an array. Its value however is more accurately 


—, than 67449c%)/V iy — 1. 


67449 Ve 
> P 
\ Np 1 + WV 


We may adopt a similar process to find the standard-deviation of the second
moment of an array in a sample. For an array of constant size $n_p$ we have


Mean (6,2)? = a ~ (1 ==) \(1 Sees (1 =) ahes 7 


where ,m, and ,. refer to the values in the sampled population, Thus 


a ae 1 
Mean (8pp1)? = 2% @ 7 —) {0 = =| (82-3) + 24 
p 


Np p ie 


and, summing for all values of p, 


1 1 
2 ——e ped 2 3 a | eae 
Ope = ple | is ee 5 . 


Accordingly we require to find S (— ) -S (<=) and S (=) —S (5) . Writing 
49} p 


‘ie ae 
1 1 1 

€ as before = — — =, we have — 
fp LY n 


2) + (p82 — 3) {s (2 —28 & + § (=) 


Pp 


1 : ‘ 
ae WV and after some reductions 
Dp 


’ : y : = 4 f | 2 3 | _ 9 5 | 
es nied aed 
aya _ os 2 
Yo jf), “Sues _@ ae a ,) 4 Dt 
Accordingly if ,4 be the second moment of the n, array in a sample of NV, 


Pont [fy 2 (24 8) (21) (4 8) (11g (22 
He a. [ft N Gea lee N. a) Tees i =~ 3) | 


Ny 


+49 {0B (14 B+ BYE-B)- (Oe) E-B 


; ; Dalle ae 
It is usually given the value lnc , and further the assumption is made that (66 n,)° 
n 

Pp 
may be neglected in 2C ny OT np “F (Son) = 6,Mo, So that we obtain the value 


—ONp 
vv 


OP fee ac CR ie eee aero ee (O). 
i ~ i, 


Now whatever may be said for this result the method by which it is reached is 
distinctly defective and this not merely because it assumes normality. We have 
in fact for any distribution of size M 


o=V tty. 


Now let us measure o from the mean value @ of the sampled population and ps 
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from the mean value ji. of zw. in the samples. Then fi, will not be f@, but be equal 


to (1 - a) fo. - Accordingly 


M : ae 
G¢tode0= Fy 1 = r) 
o+00 = ( = H) be + Of. 


1 1 5 
or expanding 80 =G4— IM (1 +a + apt sam) 


1 dp, 1 3 5 
eis (1+ oy tam rem) 
1 /op5\? 3 15 
33 i) (1 z out am) 
5 ly x 5 5 zy) 
Ray peal = oe py g, mace nia.: Py: 
+76 (E) (1+ a) -128 ) + es 
Now we need first the mean value of 8¢ and for this purpose require the mean 
2 : ANS 
powers of Spy These will be about the mean value of 4, n samples, 1.€. (1 — ii) ps 


and are*, if we use curved brackets to represent means : 
(22) = {ey ee — 57) Be- az ( _i Fy) 
(GE) =9 IGE) = La - ) Bae i) (1 an) 
ea = |, = 98,=68,42)— (99, 218, — 188, 226 

= ~ WP (B14 — 3R2 Myre CPs Bo Iiseaee ) 


+75 @B— 338. — 228, + 54) + ea 


{(%2) | = ps [8 B= 1+ gy Bi 4B. — 15 248, + 488, + 968, ~ 30) 


yeas = = oo _ a 
— qe Be — 408, — 548." — 968, + 3368, + 5288, — 306) + + 5d (OQ) 
where as usual 
Bor = Por+s/ fs" *, and Borts = Hoyts X Ps) ps 
and have reference to the sampled population. 
Substituting in (P) we have 


Mean o =a + {60} =a (1 — do) say 


. oe ie ee Sh ae ae 
=o E — gay (Bo + 8) + zag Gps (BBs — LSB? + 148. — 488, 55)| FR 


* The value for $\{(\delta\mu_2/\mu_2)^2\}$ is well-known, the two later values have been recently given by Professor
Tchouproff, Biometrika, Vol. XII, p. 194.




In the special case of a normal distribution this reduces to 


3 (oil 
Ne rentteesery | eee et ee. 5 aren aaunages 3). 
cg =| 4M 32. | -) 

This agrees with the value given in Biometrika, Vol. x, p. 526, Equation (xv), 
which is now generalised in (R) for a sample from material following any fre- 
quency distribution given by BB ep eatelyere 


We must now adapt the an: (R) for the array n, of a sample of size V. We 
1 al 
need only to replace Mu and Fa by S (;,) and S (aa) of our p. 273 and retain up to 


Up 
terms in 1/n,. We find 


1f2+8 1 i a es : 
Mean Onp = ony | ea 8 Tp @ i Hm) Aine 1 28 7 = (SB, ae 15Be — 26, -— 488, — 103) 
ho ee Cl). 
This becomes in the case of normal frequency 
3 31 . 
Ny = On Se rsa ial onal |g aia eteieraueis.cieieverestys T os 
Onpy = ONp E diy 32h, wee eeee ( T yi 
We can now find o, from (P). Subtracting the mean value {dc} = — GA, from 6c 
me 1 1 
we have, if A, =Av— am * 3M 
ice oe lnops il 3 
Bo — {80} = 7 !ro +5 ils (+55 +san) 
Ie / Opts \? 3 
~a Ga) (+H) 
1 /Sp\3 5 us) 
eal ) = 55 iis ot Rance (U) 
Hence squaring and taking means we find 
ss ae tl Epa (Loy “1 1 ae 1 
Weg? | he 2 = Ne te dat. aes? 
ae 4 | 4* sen es b+) 
Ie ous 2 Spe 
-aiGe)} +a) aCe) + | 
Ep eat 48,—7B.2+108,— 248, — 23 
=a “4m 7 32m! By - Bot B.- Bi - 3) eee Tee ee rice em (V); 
oe ge BN Be=1f,_ 1 4B,—7By + 108, — 248, - 28 = 
— Gh eee (W). 
For a normal distribution this becomes* 
o il 
Fear (1 — ai)= G i 5 MCE <etasretacon pees (X). 


* The value is in agreement with that given in Biometrika, Vol. x, p. 526, Equation (xvi).
Formula (X) shows that even for small values of M we have a value only slightly
less than the usual $\sigma/\sqrt{2M}$.
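The closeness of the series to the truth in the normal case can be checked numerically. The following is a minimal sketch, not part of the original memoir: it assumes that (S) and (X) have the standard forms $\sigma\{1 - 3/(4M) - 7/(32M^2)\}$ and $(\sigma/\sqrt{2M})\{1 - 1/(8M)\}$, and it compares them with the exact mean and standard deviation of the sample standard deviation (divisor M) obtained from the chi distribution. The function names are illustrative only.

```python
import math


def exact_mean_and_sd_of_s(m, sigma=1.0):
    """Exact mean and s.d. of s = sqrt(sum((x - xbar)**2) / m) in normal samples of size m."""
    k = m - 1  # degrees of freedom of the chi variate behind s
    # E[chi_k] = sqrt(2) * Gamma((k + 1) / 2) / Gamma(k / 2); lgamma avoids overflow.
    mean_chi = math.sqrt(2.0) * math.exp(math.lgamma((k + 1) / 2) - math.lgamma(k / 2))
    mean_s = sigma * mean_chi / math.sqrt(m)
    mean_s_squared = sigma ** 2 * k / m  # E[s^2] = sigma^2 (1 - 1/m)
    sd_s = math.sqrt(mean_s_squared - mean_s ** 2)
    return mean_s, sd_s


def series_mean_and_sd_of_s(m, sigma=1.0):
    """Series forms assumed here for (S) and (X); an assumption, see the lead-in."""
    mean_s = sigma * (1.0 - 3.0 / (4.0 * m) - 7.0 / (32.0 * m ** 2))
    sd_s = sigma / math.sqrt(2.0 * m) * (1.0 - 1.0 / (8.0 * m))
    return mean_s, sd_s


for m in (5, 10, 25, 100):
    exact = exact_mean_and_sd_of_s(m)
    series = series_mean_and_sd_of_s(m)
    usual_sd = 1.0 / math.sqrt(2.0 * m)  # the usual sigma / sqrt(2M), with sigma = 1
    print(m, exact, series, usual_sd)
```

For each M printed, the exact standard deviation falls a little below the usual $\sigma/\sqrt{2M}$, which is the sense of the remark on Formula (X).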
Turning to the array N, in a sample of V we require the mean values of 


S (~) and S$ (—). 


Np Ny/ 
L\2 I { 3/1 1 5 65 /1 long 
= Gah 8G-M0+ a) BGP 
Np Vi, | a\i N. "Givi + 798 Mp eeaclN, eee 


T\. 6657 “12 
al (1 een Gad eed or 


2 ie 
s(—) = eee ee 
Mp) fipVip 8 


Substituting in (X) we find 


Son = ip B.— 1 G = wee —- I 4B, - TB? + 4 Zia 242, a ab ) (Z) 
Pp 2 Vii, SN 16%, (ea ese eee 'e 
For a normal distribution we have * 
Sony = Bt. (1-24 gh) 
Pony a Vn, (1 8N + 2n, oe is Sisiete’ahvieleie-s, sjavenesetovarslece (AA). 


Thus the usual value $\bar{\sigma}_{n_p}/\sqrt{2\bar{n}_p}$ will be about 2% in defect in an array of 25 in
a sample of 1000.

It will be realised therefore that if we do not take arrays of less than 25 in 
samples of 1000, the usual values of the mean standard-deviation of an array n, and 
the standard-deviation of these standard-deviations will not lead us badly astray. 
We have finally to ask what degree of weight we can give to the "probable error"
of this standard-deviation, i.e. to $.67449\,\sigma_{\sigma_{n_p}}$. This is only determining how far
$\sigma_{n_p}$ follows the normal law of distribution, that is to say, how nearly are $_{\sigma}\beta_1$ and
$_{\sigma}\beta_2 - 3$ zero for the distribution of $\sigma_{n_p}$, these representing the first two β-coefficients
for this distribution. Before we do this, however, we may find from (Q), p. 276,
the values of $_{\mu_2}\beta_1$, $_{\mu_2}\beta_2$ for the $\mu_2$ of samples of constant size M to the second order
of approximation. They are


1 (8,—36,— 68,42) ene (Bs —2  38,—218,- 188, + | 


A 
ub, = 


M (B2— i M B2- 1 3B, - 9B. — 188, +6 
<) rE ry ee fo ts. (BB), 
reducing with a Gaussian distribution to+ 
8 1 

yb , = a + i) i i ie ei ar) (CC), 
* The reader should note that Sa is not the same thing as if we put 
ce Sd Fi, Ls cie= ae sai 
Coe e Dore "vba Tin, IN' N2 myN ye 


from (N) on p. 275. 

† These values are identical to the degree of approximation adopted with those given by "Student,"
in Biometrika, Vol. VI, p. 4. Professor Tchouproff (Ibid., Vol. XII, p. 192) does scant justice to
"Student." The only misprint we can see is that 1/n appears in the first term instead of 1/n², the
power having probably been 'drawn' in printing; it re-appears in the next equation. Further "Student"
gives not only Tchouproff's (19) but his (22); this as "Student" himself indicates (p. 4) involved
Tchouproff's lengthy equations (20) and (21), which he refrained owing to their length from publishing.
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1 B, —48,— 38? — 24,8, + 128, + 968, 


and By =3=- i es ee eS eae (DD), 
om Mu (B.—1) 
reducing with a Gaussian distribution to* 
, 1 
uD, —3= M : 


It will be observed that the approach to the normal curve is by no means close
for fairly small samples. For example, if M = 24, we might easily have $_{\mu_2}\beta_1 = .3$
and $_{\mu_2}\beta_2 = 3.5$. In other words the distribution of $\mu_2$ in samples of size M is far
from as close to a normal curve as the distribution of means $\mu_1'$.

We should anticipate accordingly that the distribution of the second moment-coefficient
in the $n_p$ array of a sample N would be even further removed.
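The figures quoted for M = 24 can be set beside the exact values for a normal parent. This is only a sketch under an assumption not made in the text itself: that the sampled population is normal, so that the sample second moment is proportional to a chi-square variate with M − 1 degrees of freedom, whose $\beta_1$ and $\beta_2$ are the standard $8/k$ and $3 + 12/k$. The function name is illustrative.

```python
def beta_coefficients_of_mu2_normal(m):
    """beta_1 and beta_2 of the sample second moment in normal samples of size m."""
    k = m - 1               # degrees of freedom of the chi-square variate
    beta1 = 8.0 / k         # squared skewness of chi-square with k d.f.
    beta2 = 3.0 + 12.0 / k  # kurtosis of chi-square with k d.f.
    return beta1, beta2


b1, b2 = beta_coefficients_of_mu2_normal(24)
print(round(b1, 3), round(b2, 3))  # 0.348 3.522
```

These agree in order of magnitude with the values .3 and 3.5 mentioned above.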


We find by replacing 1/J/* by S(7 =] that 


(C2 (meh mea 


fy / Ny | avi Tp Bo at (\e 
= B, a 38, — 6B, +2 ( 3 4 BB. - —5 5). 
See ia igo oa Le = 
ps I Des Ni © ip 8; = 88, = - 6B, 42 
(a ae _ 9 , 18, 48,— 68. — 248, + 308, + 968, — ay 
\ Be ji Ny N Np (B. —1) 
Srekotee eae (KE). 
Hence 
8,—38,—68,+2(,. : Bs Se 8 (88. —5 
wa tBa Bi Br, 8 TAG, 8GH-o 
a, Niy\ B.-1 ~ B,—38, — 68, + 2/) 
Be, Ce ee (FF), 
giving for a Gaussian distribution 
ae 
np ae tea 
3, — 43, — 248 Wo) 38, —: e “ 
a py pa 2 Pe $i 2B, 4 B+ 90-33 aay 
? Np (8, — 1) a 
giving for a Gaussian distribution 
Me 
nyPy — 3 NG 


Thus for a small array of, say, 25 we might easily have $_{n_p}\beta_1 = .37$ and
$_{n_p}\beta_2 = 3.6$, values very remote from a Gaussian distribution.


It is clear therefore that the "probable error" of a second moment-coefficient
has no very illuminating meaning in the case of the arrays of small or even
moderate size in the case of a sample of size N. It may, however, be remarked
that the distribution of $\mu_2$ is one thing and that of $\sigma_{n_p}$, which is what we usually
require, is another. In order to obtain this we must raise the expression in (U)


* See the second footnote on preceding page. 
for $\delta\sigma - \{\delta\sigma\}$ to the third and fourth powers. But if we keep only terms of the
two lowest orders in our results we require to ascertain the values of


Ga GE): 


The necessary term in the latter is ~ (B: —1)', but the finding of the former is 
p 

far more troublesome and we have not so far succeeded in determining it. But it 
is possible to obtain some idea of the deviation of the Tny Curve from normality by 
considering what its 6’s are in the case of the sample being made on a population 
following the normal law. We are able to do this, if we carry a stage further the 
work outlined in a paper in Biometrika, Vol. x, p. 526. We require the values of 
>, cy? and ps, wy on p. 526 carried to a higher approximation by the introduction 
of the additional term in the Stirling’s Theorem expression for the factorial. 
Miss Pairman has carried this out and finds 


= 3 1 3 
Se [ig ee ee 
= ee @ on Ba Tae one oe 


which leads, having regard to (x1) on p. 525, to 


gp been, “Os i agar 
she = os? Mote >? = on (1 — Ay = =) se ee eens er rege (II). 
We must now turn to the equations on pp. 527—8 to determine yw; and py. 
We find* : 7 
oe 142-8 0+2\(0 3. 7 9 ) 
aH dn? ( 2n) — 4n? 2n 4n 32n? 128 
3 
= in (1 <r 2 as far as our approximation is valid......... (KK). 
: 5 1 avy 3 ) 3 fi SENG 
ss ed _ is ee 
eae ond ee 2n (4 5 ar + Bae] ~ an G i a) 
304 1 
= |) esasince sk Bie ostesesieasesesnie@essgenesssencceaeee LL). 
4n? (1 7a eee 


We must now replace n by m,+6n, and sum as we have frequently had 


occasion to do. 


Talore = 7 *e( =-7) 
We have: OT npby oi, 1+ de Ne ee a (MM), 
on eS 33 
= _ 2 Be Soe eleven eee Pet NN), 
ony us 4,2 (1 aE 4, N/ ( ) 
_ 3e%n, 5 3 
On Ma = ai? (1 + i, Ww) Seve ener e eee eeeeenns (OO). 


* These values may be used to determine the nature of the distribution of in samples of constant 


size n. We have: 
1 9 0 
3B, =5 (147), 2Bp=3+-. 


The term in 1/n? in sBy could not be determined unless we went to a still higher order term in 3 
Clearly for a sample of 25 $_{\sigma}\beta_1$ approaches close to the Gaussian and $_{\sigma}\beta_2$ still closer. The non-approximate
values are .0219 and 3.0014 (loc. cit., p. 529).
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; 2 é 
Hence Jap = : (1 + as v) Sec eRe ee (RE) 


Accordingly we see that when samples N are taken from normal material the
array $n_p$ of varying size in these samples will not differ very greatly from normality.
For example if $\bar{n}_p = 25$ and the sample N be 1000, we shall have $\beta_1 = .024$ and
$\beta_2 = 3.12$ for the distribution of $\sigma_{n_p}$, showing no great deviation from normality, although more than in
the case of a sample of constant size. It is probable that the deviation from
normality will be somewhat greater when the sampled population itself is not
normal. Still it is important to note that the distribution of $\sigma_{n_p}$ for all cases is
likely to be far closer to the normal, and the "probable error" of $\sigma_{n_p}$ therefore more
intelligible, than is the case with $_{n_p}\mu_2$. In the same way it is extremely probable
that the distributions of $(_{n_p}\mu_3)^{1/3}$ and $(_{n_p}\mu_4)^{1/4}$ are more nearly normal than those of
$_{n_p}\mu_3$ and $_{n_p}\mu_4$.


(V) Mathematical Contributions to the Theory of Evolution. XIX. Second Supplement
to a Memoir on Skew Variation. Phil. Trans. Series A, Vol. 216,
pp. 429-457.

There are one or two corrections to be made in this paper by Pearson: 


(a) p. 439, l. 18. The printer has drawn the solidus and the $3\beta_1 - 2\beta_2 + 6$
which followed it. Thus
$$4 - m = 2(\beta_2 + 3)$$
should be read as
$$4 - m = 2(\beta_2 + 3)/(3\beta_1 - 2\beta_2 + 6).$$

(b) p. 441, l. 18, about the middle of the page the equation
$$y = 12\,(\sec\theta - \operatorname{cosec}\theta)$$
is given. It should be
$$y = 3 \times 12\,(\sec\theta - \operatorname{cosec}\theta)^2/\sec\theta,$$
but no use has been made of the equation in the paper.
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QUADRATURE COEFFICIENTS. 


In a large amount of recent quadrature work we have found that Sheppard's
formula (c) given in Biometrika, Vol. I, p. 276, gives very satisfactory results, and
Mr P. F. Everitt has tabled the values of the three coefficients. His manuscript
table has proved so useful that we reproduce it here, as others may also find it a
help. It will eventually appear in the Tables for Statisticians and Biometricians.
The formula supposes the quadrated area to be divided into p trapezettes on
bases of equal size h. Then $A_c$ the chordal area is given by

$$A_c = h\,(\tfrac{1}{2}z_0 + z_1 + z_2 + \ldots + z_{p-1} + \tfrac{1}{2}z_p),$$

where $z_0, z_1, z_2, \ldots, z_{p-1}, z_p$ are the equally spaced ordinates.

The required area of the curve is then

$$\text{Area} = A_c + C_1\{(z_1 - z_0) - (z_p - z_{p-1})\}h - C_2\{(z_2 - z_1) - (z_{p-1} - z_{p-2})\}h + C_3\{(z_3 - z_2) - (z_{p-2} - z_{p-3})\}h.$$


Here $C_1$, $C_2$, $C_3$ are certain functions of p and are provided for each value of p in
the accompanying Table. They are selected to give the best result, provided we
stop at third terminal differences.
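The use of the formula may be illustrated by a short computation. The sketch below is an illustration only: the function names are hypothetical, and the two rows of coefficients (p = 6 and p = 10) are as read from the Table that follows. On these rows the formula reproduces the areas of $x^3$ on (0, 6) and $x^2$ on (0, 10) to the accuracy of the seven-figure coefficients.

```python
def chordal_area(z, h):
    """Chordal (trapezoidal) area for ordinates z[0..p] spaced h apart."""
    return h * (0.5 * z[0] + sum(z[1:-1]) + 0.5 * z[-1])


def sheppard_c_area(z, h, c1, c2, c3):
    """Chordal area corrected by the first three pairs of terminal differences."""
    p = len(z) - 1
    if p < 6:
        raise ValueError("the table of coefficients starts at p = 6")
    d1 = (z[1] - z[0]) - (z[p] - z[p - 1])
    d2 = (z[2] - z[1]) - (z[p - 1] - z[p - 2])
    d3 = (z[3] - z[2]) - (z[p - 2] - z[p - 3])
    return chordal_area(z, h) + h * (c1 * d1 - c2 * d2 + c3 * d3)


# (C1, C2, C3) for two values of p, as read from the Table below.
COEFFICIENTS = {
    6: (0.2071429, 0.3357143, 0.4714286),
    10: (0.1791068, 0.1722411, 0.0854119),
}

cubic = [float(x ** 3) for x in range(7)]     # x^3 at x = 0..6, true area 324
square = [float(x ** 2) for x in range(11)]   # x^2 at x = 0..10, true area 1000/3

print(chordal_area(cubic, 1.0), sheppard_c_area(cubic, 1.0, *COEFFICIENTS[6]))
print(chordal_area(square, 1.0), sheppard_c_area(square, 1.0, *COEFFICIENTS[10]))
```

The chordal areas alone are 333 and 335; the terminal corrections bring them to within a few units in the sixth decimal place of 324 and 333.3333.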


Quadrature Coefficients for Sheppard's Formula (c), Biom. Vol. I, p. 276.


P. F. EVERITT


P Cy 
6 |) +°2071429 
ie *1957755 
8 "1883296 
9 "1830517 
10 *1791068 
11 ‘1760430 
12 1735931 
13 *1715884. 
14|  °1699171 
15: *1685022 
16 *1672888 
17 -1662365 
18 °1653151 
19 *1645017 
20 *1637782 
at *1631305 
22 *1625472 
28. "1620193 
2 *1615391 
25 “16110014 
26 “1606982 
any *1603280 
28 “1599861 
29 ‘1596694 
30 *1593752 
31 *1591013 
382 *1588455 
83 *1586062 
34 |  *1583817 
35 “1581708 
36 *1579723 
37 *1577850 
38 *1576082 
39 *1574408 
40 *1572822 
41 °1571317 
42 "1569887 
43.|  -1568527 
44 |  °1567231 
4S *1565996 
46 "1564816 
Ai “1563688 
48 |  °1562610 
49 | *1561577 
50 *1560587 
51 *1559637 
52 *1558725 
58. *1557849 


C2 C3 | p C1 Cy 

+°3357143 +:°4714286 || 54 | 4°1557006 +4 -1034971 
*2532407 *2108218 I 55 "1556195 1033186 
‘2124868 1369312 56 "1555414 “1031470 
*1882653 10387946) 57 *1554662 *1029820 
1722411 0854119 || 58 1553936 *1028231 
*1608670 0738621 49 *1553235 "1026699 
*1523810 0659864 || 60 "1552559 *1025223 
°1458097 0602962 || 61 °1551906 ‘1023799 
*1405724 0560045 || 62 “1551274 "1022424 
"1363012 0526583 |) 60 "1550663 102 LO96 
1327519 0499796 || 64 “1550071 “1019813 
"1297561 0477890 || 65 "1549498 "1018571 
"1271989 0459655 || 66 “1548944 “1017370 
‘1249776 0444247 67 *1548406 “1016207 
‘1230418 ‘O431061L || 6S "1547884 “1015081 
1213364 70419654 || 69 1547378 *1013990 
*1198227 0409690 70 1546887 "1012931 
"1184701 0400913 71 *1546410 “1011905 
1172542 0393126 72 “1545946 “1010909 
"1161553 0386169 73 1545496 *1009941 
1151573 ‘0379919 Th *1545058 “1009002 
“1142469 0374272 70 1544632 “1008089 
1134131 0369147 76 "1544218 *1007202 
‘1126466 0364473 77 1543814 *1006339 
‘1119396 0360195 78 "1543422 “1005499 
“1112855 0356265 1) *15480389 “1004682 
“1106784 0352641 80 1542666 1003886 
*1101136 0349290 81 1542303 1003111 
"1095867 0346181 |) 82 1541948 -1002356 
‘1090941 703438290 | 83 1541603 “1001621 
*1086325 0340595 |S 4 1541265 *1000903 
“1081991 0338076 =| 55 1540936 *1000204 
‘1077914 0335717 | S6 *1540615 70999521 
"1074072 0333502 | 87 *1540801 “0998856 
"1070444 0331420 | 88 1539995 0998206 
‘1067014 0329458 89 1589695 0997571 
"1063765 0327607 90 1539403 (0996951 
“1060684 0325857 91 “1539116 0996346 
*1057759 0324201 92 *1538837 0995754 
"1054976 0322630 93 1538563 *0995175 
*1052328 0321139 D4 *1538295 70994610 
"1049803 0319722 95 *15388034 0994057 
*1047393 0318373 | 96 1537777 0993516 
"1045091 ‘0317088, || 97 1537526 "0992987 
"1042890 0315861 | 98 *1537280 0992469 
"1040783 0314690 || 99 1537040 0991962 
‘1038765 0313571 | 100 "1536804 0991465 
"1036829 0312449 


+ 


0311473 
‘0310489 
“0809545 
‘0308639 
‘0307767 
“0306929 
0306122 
‘0805345 
*0304596 
‘0303874 
0303177 
*0302503 
*OB01852 
-03801223 
‘0300614 
“03800025 
0299445 
“0298902 
‘0298366 
"0297846 
‘0297341 
‘0296852 
‘0296376 
“O295914 
*0295465 
*0295029 
0294604 
-0294190 
‘0293788 
0293396 
0293015 
0292643 
“0292280 
‘0291926 
*0291582 
0291245 
0290917 
°0290596 
°0290283 
‘0289977 
‘0289678 
‘0289386 
0289101 
0288821 
‘0288548 
‘0288281 
“0288020 
ON GENERALISED TCHEBYCHEFF THEOREMS IN THE
MATHEMATICAL THEORY OF STATISTICS.


By KARL PEARSON, F.R.S.


(1) Single Variate. 


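Let $y = \phi(x)$ be any law of frequency and let the limits of the distribution be
a and b, then if N be the total frequency,

$$N = \int_a^b \phi(x)\,dx,$$

and if $\bar{x}$ be the mean value of the variate,

$$N\bar{x} = \int_a^b x\,\phi(x)\,dx.$$

Generally, if $\mu_s$ be the sth moment-coefficient about the mean,

$$N\mu_s = \int_a^b (x - \bar{x})^s\,\phi(x)\,dx.$$

Now consider

$$\mu_{2s} = \frac{1}{N}\int_a^b (x - \bar{x})^{2s}\,\phi(x)\,dx,$$

and let $\epsilon$ be any value of $x - \bar{x}$, then

$$\frac{\mu_{2s}}{\epsilon^{2s}} = \frac{1}{N}\int_a^b \left(\frac{x - \bar{x}}{\epsilon}\right)^{2s}\phi(x)\,dx.$$

Now pick out all the values for which $x - \bar{x}$ is greater than $\epsilon$, and let us suppose
$b > a$; then

$$\frac{\mu_{2s}}{\epsilon^{2s}} > \frac{1}{N}\int_{\epsilon + \bar{x}}^{b} \left(\frac{x - \bar{x}}{\epsilon}\right)^{2s}\phi(x)\,dx,$$

and therefore

$$\frac{\mu_{2s}}{\epsilon^{2s}} > \frac{1}{N}\int_{\epsilon + \bar{x}}^{b} \phi(x)\,dx,$$

since $(x - \bar{x})/\epsilon$ is always greater than unity.

But $\frac{1}{N}\int_{\epsilon + \bar{x}}^{b}\phi(x)\,dx$ is the chance of an individual occurring with a deviation
greater than $\epsilon$ from the mean $= 1 - P$ where P is the chance of an individual
occurring with a deviation less than $\epsilon$. Hence

$$P > 1 - \frac{\mu_{2s}}{\epsilon^{2s}}.$$

Now let $\epsilon = \lambda\sigma$, where $\sigma = \sqrt{\mu_2}$ is the standard deviation of the distribution.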
Thus the chance of a deviation being of less magnitude than $\lambda\sigma$ is

$$P > 1 - \frac{\mu_{2s}}{\lambda^{2s}\sigma^{2s}} = 1 - \frac{\beta_{2s-2}}{\lambda^{2s}} \quad\ldots\ldots\text{(i)}.$$

For s = 1 this becomes

$$P > 1 - \frac{1}{\lambda^2} \quad\ldots\ldots\text{(ii)}.$$

This special case is Tchebycheff's Theorem *.

Inequality (i) gives our first generalisation for a single variate of Tchebycheff's
Theorem in (ii)†. We can now compare the accuracy of (i) and (ii) by supposing
them applied to a normal distribution of frequency for the cases of deviations
1.5, 2, 3 and 4 times the standard deviation. In this case

$$\beta_{2s-2} = (2s - 1)(2s - 3)\ldots 1.$$

TABLE I.

Values of Lower Limit for P given by $1 - \beta_{2s-2}/\lambda^{2s}$.

  s                    λ = 1.5     λ = 2      λ = 3      λ = 4
  1                     .5556      .7500      .8889      .9375
  2                     .4074      .8125      .9630      .9883
  3                    -.3169      .7656      .9794      .9963
  4                                           .9840      .9984
  5                                           .9840      .9991
  6                                           .9804      .9994
  7                                                      .99950
  8                                                      .99953
  9                                                      .99950
  10                                                     .99940
  Actual value of P     .8664      .9545      .9970      .99994


Clearly the maximum for any $\lambda$ will be found by making $(2s - 1)/\lambda^2$ equal to
unity, or if $\lambda^2$ = an odd number, $s = \tfrac{1}{2}(\lambda^2 + 1)$ and $\tfrac{1}{2}(\lambda^2 + 1) - 1$ will give equal limits.
If $\lambda^2$ be an even number then $s = \tfrac{1}{2}\lambda^2$ will give the highest limit.
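The entries of Table I, and the rule for the best value of s, may be reproduced with a few lines of code. This is a minimal sketch for the normal case only; the function names are illustrative and not part of the memoir.

```python
def normal_beta(s):
    """beta_{2s-2} = mu_{2s} / sigma^{2s} = (2s-1)(2s-3)...1 for a normal distribution."""
    product = 1
    for odd in range(1, 2 * s, 2):
        product *= odd
    return product


def lower_limit(s, lam):
    """Lower limit for P from inequality (i): 1 - beta_{2s-2} / lambda^{2s}."""
    return 1.0 - normal_beta(s) / lam ** (2 * s)


def best_order(lam, s_max=50):
    """Smallest s giving the highest lower limit for a given lambda."""
    return max(range(1, s_max + 1), key=lambda s: lower_limit(s, lam))


for s in range(1, 11):
    print(s, [round(lower_limit(s, lam), 5) for lam in (1.5, 2.0, 3.0, 4.0)])
```

For $\lambda = 4$ the highest limit is reached at s = 8, and for $\lambda = 3$ the orders s = 4 and s = 5 give the same limit, as stated above.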


* It was first proved in the Recueil des sciences mathématiques, T. II, according to Liouville, but I
cannot trace this reference at all. It was translated from Russian into French in Liouville's Journal de
mathématiques, Vol. XII, pp. 177-184, Paris, 1867. The proof there given is somewhat lengthy and at
first sight the result might appear more general than (ii); but this is not so. Assume $x = u + v + w + \ldots$
and suppose u, v, w uncorrelated, so that $\sigma_x^2 = \sigma_u^2 + \sigma_v^2 + \sigma_w^2 + \ldots$ then we have with minor differences
of notation and terminology (especially the use of the words "mathematical expectation" for our
moments) Tchebycheff's own phrasing of his theorem. The remark of Dr Anderson (Biometrika,
Vol. X, p. 269) with regard to the neglect of the theory of "mathematical expectation" by the
English statistical school seems based on a misunderstanding of the moment method.

† This generalised form of Tchebycheff's Theorem was given by me in a paper for the Honours
degree of the University of London in Statistics, October, 1915.


(2) Two Variates; Limit to the Frequency within an Elliptic Area round the 
Mean as Centre. 


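Let the law of frequency be $z = \phi(x, y)$ and let the standard deviations of
x and y be $\sigma_1$, $\sigma_2$, and r be the coefficient of correlation between x and y. Let us take
as our ellipse,

$$\frac{1}{1 - r^2}\left(\theta_{11}\frac{x^2}{\sigma_1^2} - 2\theta_{12}\,r\,\frac{xy}{\sigma_1\sigma_2} + \theta_{22}\frac{y^2}{\sigma_2^2}\right) = \chi^2,$$

x and y being measured as deviations from the mean.

Then by giving special values to $\theta_{11}$, $\theta_{22}$, $\theta_{12}$ and $\chi^2$ we can get any ellipse we
please. Further since the curve is to be an ellipse $r^2\theta_{12}^2 < \theta_{11}\theta_{22}$ * and we shall take
$\theta_{11}$ and $\theta_{22}$ always positive. Thus $\chi^2$ and all its powers will invariably be positive.

Now consider

$$I_s = \frac{1}{N}\iint \phi(x, y)\,\chi^{2s}\,dx\,dy, \quad\text{if}\quad N = \iint \phi(x, y)\,dx\,dy,$$

the integration extending all over space covered by the frequency surface, and $\chi^2$ standing for the quadratic expression above. Divide
both sides by $\chi_0^{2s}$,

$$\frac{I_s}{\chi_0^{2s}} = \frac{1}{N}\iint \phi(x, y)\left(\frac{\chi}{\chi_0}\right)^{2s} dx\,dy.$$

Take out all the values for which $\chi$ is greater than $\chi_0$, then

$$\frac{I_s}{\chi_0^{2s}} > \frac{1}{N}\iint \phi(x, y)\left(\frac{\chi}{\chi_0}\right)^{2s} dx\,dy,$$

when the integral extends over the area for which $\chi$ is $> \chi_0$. Hence

$$\frac{I_s}{\chi_0^{2s}} > \frac{1}{N}\iint \phi(x, y)\,dx\,dy$$

> chance of an observation falling outside the ellipse $\chi_0$.

Let P be the chance of an observation falling inside this ellipse, then we have
at once

$$P > 1 - \frac{I_s}{\chi_0^{2s}} \quad\ldots\ldots\text{(iii)}.$$

Now we define

$$p_{ss'} = \frac{1}{N}\iint \phi(x, y)\,(x - \bar{x})^s (y - \bar{y})^{s'}\,dx\,dy = \frac{1}{N}\iint \phi(x, y)\,x^s y^{s'}\,dx\,dy$$

in our case, as the s, s'th product moment-coefficient about the mean. And it is
very convenient to write

$$q_{ss'} = p_{ss'}/(\sigma_1^s \sigma_2^{s'}) \quad\ldots\ldots\text{(iv)}$$

and term $q_{ss'}$ a reduced product moment-coefficient.


* We shall generally wish to have symmetry of expression between x and y, and in this case we take
$\theta_{22} = \theta_{11} = \theta$ say and write $\theta_{12}r/\theta = \rho$ and we shall have as necessary condition for the ellipse $\rho < 1$.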
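It is clear that by simple expansion of the trinomial expression, we can always
find $I_s$ in terms of $q_{ss'}$.


We have accordingly to study the expansion of

$$\frac{1}{(1 - r^2)^s}\left(\theta_{11}\frac{x^2}{\sigma_1^2} - 2r\theta_{12}\frac{xy}{\sigma_1\sigma_2} + \theta_{22}\frac{y^2}{\sigma_2^2}\right)^s$$

$$= \frac{1}{(1 - r^2)^s}\sum_{u=0}^{s}\sum_{m=0}^{s-u} (-1)^u\,2^u r^u\,\theta_{12}^u\,\theta_{11}^{s-m-u}\,\theta_{22}^m\,\frac{s!}{(s-u-m)!\,m!\,u!}\,\frac{x^{2s-2m-u}\,y^{2m+u}}{\sigma_1^{2s-2m-u}\,\sigma_2^{2m+u}},$$

and if this value be substituted in the integral expression for $I_s$ we find

$$I_s = \frac{1}{(1 - r^2)^s}\sum_{u=0}^{s}\sum_{m=0}^{s-u} (-1)^u\,2^u r^u\,\theta_{12}^u\,\theta_{11}^{s-m-u}\,\theta_{22}^m\,\frac{s!}{(s-u-m)!\,m!\,u!}\,q_{2s-u-2m,\,2m+u} \quad\ldots\ldots\text{(v)}.$$

The lower values can be equally readily found by the expansion of

$$\left(\theta_{11}\frac{x^2}{\sigma_1^2} - 2\theta_{12}\,r\,\frac{xy}{\sigma_1\sigma_2} + \theta_{22}\frac{y^2}{\sigma_2^2}\right)^s$$

in powers of r by aid of the binomial.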
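The first few cases are

$$I_1 = \frac{1}{1 - r^2}\{\theta_{11}q_{20} - 2\theta_{12}r\,q_{11} + \theta_{22}q_{02}\},$$

$$I_2 = \frac{1}{(1 - r^2)^2}\{\theta_{11}^2 q_{40} + \theta_{22}^2 q_{04} + 2\theta_{11}\theta_{22}q_{22} - 4\theta_{12}r(\theta_{11}q_{31} + \theta_{22}q_{13}) + 4\theta_{12}^2 r^2 q_{22}\},$$

$$I_3 = \frac{1}{(1 - r^2)^3}\{\theta_{11}^3 q_{60} + \theta_{22}^3 q_{06} + 3\theta_{11}\theta_{22}(\theta_{11}q_{42} + \theta_{22}q_{24})$$
$$\qquad - 6\theta_{12}r(\theta_{11}^2 q_{51} + \theta_{22}^2 q_{15} + 2\theta_{11}\theta_{22}q_{33})$$
$$\qquad + 12\theta_{12}^2 r^2(\theta_{11}q_{42} + \theta_{22}q_{24}) - 8\theta_{12}^3 r^3 q_{33}\},$$

$$I_4 = \frac{1}{(1 - r^2)^4}\{\theta_{11}^4 q_{80} + \theta_{22}^4 q_{08} + 4\theta_{11}\theta_{22}(\theta_{11}^2 q_{62} + \theta_{22}^2 q_{26}) + 6\theta_{11}^2\theta_{22}^2 q_{44}$$
$$\qquad - 8\theta_{12}r(\theta_{11}^3 q_{71} + 3\theta_{11}\theta_{22}(\theta_{11}q_{53} + \theta_{22}q_{35}) + \theta_{22}^3 q_{17})$$
$$\qquad + 24\theta_{12}^2 r^2(\theta_{11}^2 q_{62} + \theta_{22}^2 q_{26} + 2\theta_{11}\theta_{22}q_{44})$$
$$\qquad - 32\theta_{12}^3 r^3(\theta_{11}q_{53} + \theta_{22}q_{35}) + 16\theta_{12}^4 r^4 q_{44}\} \quad\ldots\ldots\text{(vi)}.$$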

These expressions simplify for various cases, but it is clear that for the general 

case of unknown type of distribution we shall have to find very high product moments 
from the observations in order to use our generalised Tchebycheff’s Theorem. 
Otherwise we shall have to make assumptions as to the relations between high order 
and low order q’s. 


Since generally $q_{20} = q_{02} = 1$ and $q_{11} = r$, we have

$$I_1 = \frac{1}{1 - r^2}(\theta_{11} + \theta_{22} - 2\theta_{12}r^2).$$


This suggests that for all cases we are likely to get simplified results, if we take
$\theta_{11} = \theta_{22} = \theta_{12} = 1$ when we find $I_1 = 2$. In other words, simplification arises if we make
our ellipse that of the normal contours, although of course for the general case this
will not be a contour of equal probability, although it may roughly approximate to it.


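Thus we find for this case,

$$I_1 = 2,$$

$$I_2 = \frac{1}{(1 - r^2)^2}\{q_{40} + q_{04} + 2q_{22} - 4r(q_{31} + q_{13}) + 4r^2 q_{22}\},$$

$$I_3 = \frac{1}{(1 - r^2)^3}\{q_{60} + q_{06} + 3(q_{42} + q_{24}) - 6r(q_{51} + q_{15} + 2q_{33}) + 12r^2(q_{42} + q_{24}) - 8r^3 q_{33}\},$$

$$I_4 = \frac{1}{(1 - r^2)^4}\{q_{80} + q_{08} + 4(q_{62} + q_{26}) + 6q_{44} - 8r(q_{71} + q_{17} + 3q_{53} + 3q_{35})$$
$$\qquad + 24r^2(q_{62} + q_{26} + 2q_{44}) - 32r^3(q_{53} + q_{35}) + 16r^4 q_{44}\} \quad\ldots\ldots\text{(vii)},$$

and the general value of $I_s$ will be

$$I_s = \frac{1}{(1 - r^2)^s}\sum_{u=0}^{s}\sum_{m=0}^{s-u} (-1)^u\,2^u r^u\,\frac{s!}{(s-u-m)!\,m!\,u!}\,q_{2s-u-2m,\,2m+u}.$$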

For the case of a normal distribution the q's are all given in terms of r
(Biometrika, Vol. XII, p. 87) and on substitution we find

$$I_1 = 2,\quad I_2 = 8,\quad I_3 = 48,\quad I_4 = 384,$$

generally $I_s = 2s(2s - 2)(2s - 4)\ldots 2$, which can be shown directly, thus:

$$I_s = \frac{1}{N}\iint z\,\chi^{2s}\,dx\,dy = \int_0^\infty \chi^{2s} e^{-\frac{1}{2}\chi^2}\chi\,d\chi = 2s\int_0^\infty \chi^{2s-2} e^{-\frac{1}{2}\chi^2}\chi\,d\chi,$$

if we integrate by parts,

$$= 2s(2s - 2)(2s - 4)\ldots 2 \times \int_0^\infty e^{-\frac{1}{2}\chi^2}\chi\,d\chi = 2s(2s - 2)(2s - 4)\ldots 2.$$

Accordingly our generalised Tchebycheff's limit becomes

$$P > 1 - \frac{2s(2s - 2)(2s - 4)\ldots 2}{\chi_0^{2s}}\ {}^*,$$

and our best value of s will be determinable from $2s < \chi_0^2$, or s must be the greatest
integer less than or the integer equal to $\tfrac{1}{2}\chi_0^2$.
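The reduction of (vii) to a constant in the normal case may be verified for the second order. The sketch below assumes the standard reduced product moments of the normal surface, $q_{40} = q_{04} = 3$, $q_{31} = q_{13} = 3r$, $q_{22} = 1 + 2r^2$; these are well-known values but are supplied here, not taken from the text, and the function names are illustrative.

```python
def i2_from_q(q40, q04, q22, q31, q13, r):
    """I_2 from (vii): the case theta_11 = theta_22 = theta_12 = 1."""
    bracket = q40 + q04 + 2.0 * q22 - 4.0 * r * (q31 + q13) + 4.0 * r * r * q22
    return bracket / (1.0 - r * r) ** 2


def normal_q_fourth_order(r):
    """Reduced fourth-order product moments of a normal surface (assumed standard values)."""
    return {"q40": 3.0, "q04": 3.0, "q22": 1.0 + 2.0 * r * r, "q31": 3.0 * r, "q13": 3.0 * r}


for r in (0.0, 0.3, 0.7, 0.9):
    print(r, i2_from_q(r=r, **normal_q_fourth_order(r)))
```

Whatever the value of r, the result is 8 (up to rounding), in agreement with $I_2 = 2s(2s - 2)\ldots 2$ for s = 2.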


Now the actual volume of the frequency surface inside the contour

$$\chi_0^2 = \frac{1}{1 - r^2}\left(\frac{x^2}{\sigma_1^2} - \frac{2rxy}{\sigma_1\sigma_2} + \frac{y^2}{\sigma_2^2}\right)$$

is known to be $1 - e^{-\frac{1}{2}\chi_0^2}$, and it is thus easy to test the present generalised
Tchebycheff limit as applied to this case.


* This result is almost at once extensible to any number of variates following the normal distribution,
but as the actual value of the probability is known there is no value in writing down this
limiting value.
TABLE II.


Generalised Tchebycheff Limit applied to the Probability that an association of two
variables lies inside a given contour $\chi_0^2$ of a normal frequency surface.


  $\chi_0^2$    Actual Probability    Minimum value of P
     4              .8647             .5000   ($I_1$, $I_2$)
     5              .9179             .6800   ($I_2$)
     6              .9502             .7778   ($I_3$)
     7              .9698             .8600   ($I_3$)
     8              .9817             .9062   ($I_3$, $I_4$)
     9              .9889             .9415   ($I_4$)
    10              .9933             .9616   ($I_4$, $I_5$)
    12              .9975             .9846   ($I_6$)
    14              .9991             .9939   ($I_7$)
    16              .9997             .9976   ($I_7$, $I_8$)
    18              .9999             .9991   ($I_9$)
    20              .99995            .99964  ($I_{10}$)
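The two columns of Table II can be recomputed directly. This is a minimal sketch for the normal surface only, using $I_s = 2^s\,s!$ and the rule that s is the greatest integer not exceeding $\tfrac{1}{2}\chi_0^2$; the function names are illustrative.

```python
import math


def normal_i(s):
    """I_s = 2s(2s-2)...2 = 2^s * s! for the normal frequency surface."""
    return 2 ** s * math.factorial(s)


def tchebycheff_limit(chi2, s):
    """Generalised Tchebycheff lower limit for P inside the contour chi_0^2."""
    return 1.0 - normal_i(s) / chi2 ** s


def best_s(chi2):
    """Greatest integer not exceeding chi_0^2 / 2 (at least 1)."""
    return max(1, math.floor(chi2 / 2))


def actual_probability(chi2):
    """Volume of the normal surface inside the contour: 1 - exp(-chi_0^2 / 2)."""
    return 1.0 - math.exp(-0.5 * chi2)


for chi2 in (4, 5, 6, 7, 8, 9, 10, 12, 14, 16, 18, 20):
    s = best_s(chi2)
    print(chi2, round(actual_probability(chi2), 5), round(tchebycheff_limit(chi2, s), 5), s)
```

Where $\chi_0^2$ is an even number the orders $s = \tfrac{1}{2}\chi_0^2$ and $s = \tfrac{1}{2}\chi_0^2 - 1$ give the same limit, which appears to be why some rows of the Table cite two I's.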


Here as in the case of a single variate the generalised Tchebycheff limit is not
very useful for low values of $\chi_0^2$. But if in any particular type of observation we
consider it desirable to look with suspicion on an observation which has occurred
and yet the odds against which are greater than 50 to 1, the Tchebycheff limit may
be of value. As illustration, suppose two variates are correlated with intensity .7,
what suspicion should we cast on an observation which gave the deviation of one
variate 3.8 times its standard deviation and of the other 3.2 times? Here

$$\chi_0^2 = \frac{1}{1 - r^2}\left(\frac{x^2}{\sigma_1^2} - \frac{2rxy}{\sigma_1\sigma_2} + \frac{y^2}{\sigma_2^2}\right) = \frac{1}{.51}\{(3.8)^2 - 1.4\,(3.8)(3.2) + (3.2)^2\} = 15.01, \text{ or say } 15.$$

Then

$$P > 1 - \frac{2^7\,(7!)}{15^7} = .9962,$$

or the odds are greater than 250 to 1 against it. Actually the probability against the
occurrence of anything as unusual as or more unusual than this is .9994, or the
actual odds 1700 to 1 about. For many purposes the odds of 250 to 1 would
amply suffice to mark suspicion, although of course in the case of normal frequency
it would be as easy or even easier to calculate the real probability as the
generalised Tchebycheff limit.
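The arithmetic of this illustration is easily retraced. The following sketch assumes, as in the text, a normal surface with r = .7 and deviations of 3.8 and 3.2 standard deviations, and takes $\chi_0^2 = 15$ with s = 7; the function name is illustrative.

```python
import math


def chi2_of_observation(dx, dy, r):
    """chi_0^2 of an observation whose deviations are dx, dy in standard-deviation units."""
    return (dx * dx - 2.0 * r * dx * dy + dy * dy) / (1.0 - r * r)


chi2 = chi2_of_observation(3.8, 3.2, 0.7)
s = 7                                               # greatest integer not exceeding 15 / 2
limit = 1.0 - 2 ** s * math.factorial(s) / 15 ** s  # generalised Tchebycheff limit
actual = 1.0 - math.exp(-0.5 * 15)                  # exact value for the normal surface

print(round(chi2, 2), round(limit, 4), round(actual, 4))  # 15.01 0.9962 0.9994
print(limit / (1.0 - limit))                               # odds given by the limit
```

The odds furnished by the limit come out somewhat above 250 to 1, consistent with the statement in the text.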


The chief interest of the investigation thus far is to show that unless we use an 
I, of a high order the Tchebycheff limit is unlikely to be of very much service. We 
can obtain it in the case of material following a normal distribution, but then we 
know the exact result and do not need it! 
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I have considered very carefully the possibilities of deducing higher q’s from 
lower q’s for non-normal systems on various hypotheses as to the nature of the 
regression and the scedasticity. The simplest hypothesis is to suppose linearity of 
regression, homoscedasticity and homocliticity of both sets of arrays. 


Let $\beta_{2s} = S(x^{2s+2})/(N\sigma_1^{2s+2})$, and $\beta_{2s+1}/\sqrt{\beta_1} = S(x^{2s+3})/(N\sigma_1^{2s+3})$


as usual; let a single dash mark the β's for the y variate, and double dashes the
β's for the y arrays of x's and triple dashes the β's for the x arrays of y's. Then
if $\bar{y}$ be the mean of the x-array of y's,


se Le aL = ; ; 
VB; = we (y/o = 7 2S (Ge + y/o, 


where 7’ is measured from the mean of the array, S is the sum for all members of 


ene er Ble ih 
the array and = the sum for all arrays. Thus if 7,= —* be the regression line, 
O7 


is ys 3 S4/i, ) S , py y2 
Vile = fr = (nett) 5 342 (n;27) S(y) 3; (ngw) S Cy DS 2s), 


O71 Oy Nye O7 Ny Oo" Nye 


: y’? a 
since S (9) S (ye) is to be the same for every array. Thus 
Ny Ny 


VBy = NB, + NB" (—7), 


Ae VB = VB =v Bs Mee en (ix), > 
(1 —7"*)” 

Similarly VB’ = Vae i vB Wee Perr see It i, (x). 
Gueeh 


Thus it is impossible in homoclitic systems for the skewness of the arrays to be 
equal to the skewness of the marginal totals if there be correlation *. 


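Again we have

$$\beta_2' = \frac{1}{N}S(y^4)/\sigma_2^4 = \frac{1}{N}\Sigma S\left(r\frac{\sigma_2}{\sigma_1}x + y'\right)^4/\sigma_2^4 = r^4\beta_2 + 6r^2(1 - r^2) + \beta_2''(1 - r^2)^2,$$

or

$$\beta_2'' = \frac{\beta_2' - r^4\beta_2 - 6r^2(1 - r^2)}{(1 - r^2)^2},$$

or, again,

$$\beta_2'' - 3 = \frac{\beta_2' - 3 - r^4(\beta_2 - 3)}{(1 - r^2)^2} \quad\ldots\ldots\text{(xi)},$$

and similarly,

$$\beta_2''' - 3 = \frac{\beta_2 - 3 - r^4(\beta_2' - 3)}{(1 - r^2)^2} \quad\ldots\ldots\text{(xii)}.$$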


* We note that if the marginal totals be both without skewness, all the arrays will also be symmetrical.
Equations (xi) and (xii) show us that if the marginal totals be mesokurtic the arrays will also
be mesokurtic.
Now consider $q_{22}$,

$$q_{22} = \frac{1}{N}S(x^2y^2)/(\sigma_1^2\sigma_2^2) = \frac{1}{N}\Sigma S\,x^2\left(r\frac{\sigma_2}{\sigma_1}x + y'\right)^2/(\sigma_1^2\sigma_2^2)$$

$$= r^2\beta_2 + (1 - r^2)$$
$$= r^2\beta_2' + (1 - r^2)$$

by symmetry. Hence it follows that in linear homoscedastic systems $\beta_2 = \beta_2'$, and
accordingly

$$\beta_2'' - 3 = \beta_2''' - 3 = (\beta_2 - 3)\,\frac{1 + r^2}{1 - r^2}.$$

This is of interest as indicating that in linear homoscedastic systems the one with
mesokurtic margins is the only one in which the kurtosis of the arrays can be the
same as that of the margins.


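Again

$$q_{33} = \frac{1}{N}S(x^3y^3)/(\sigma_1^3\sigma_2^3) = \frac{1}{N}\Sigma S\,x^3\left(r\frac{\sigma_2}{\sigma_1}x + y'\right)^3/(\sigma_1^3\sigma_2^3)$$

$$= r^3\beta_4 + 3r(1 - r^2)\beta_2 + \sqrt{\beta_1}\sqrt{\beta_1''}\,(1 - r^2)^{3/2}$$

$$= r^3\beta_4 + 3r(1 - r^2)\beta_2 + \sqrt{\beta_1}\sqrt{\beta_1'} - r^3\beta_1 \quad\ldots\ldots\text{(xv)}$$
$$= r^3\beta_4' + 3r(1 - r^2)\beta_2' + \sqrt{\beta_1}\sqrt{\beta_1'} - r^3\beta_1' \quad\ldots\ldots\text{(xvi)}$$
by symmetry.
It follows from (xv) and (xvi) that it is needful for

$$\beta_4 - \beta_1 = \beta_4' - \beta_1' \quad\ldots\ldots\text{(xvii)}.$$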
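Finally we have

$$q_{44} = \frac{1}{N}S(x^4y^4)/(\sigma_1^4\sigma_2^4) = \frac{1}{N}\Sigma S\,x^4\left(r\frac{\sigma_2}{\sigma_1}x + y'\right)^4/(\sigma_1^4\sigma_2^4)$$

$$= r^4\beta_6 + 6r^2(1 - r^2)\beta_4 - 4r^2\beta_3 + 4\beta_3\sqrt{\beta_1'/\beta_1} + \beta_2\beta_2''(1 - r^2)^2,$$

or

$$q_{44} = r^4\beta_6 + 6r^2(1 - r^2)\beta_4 - 4r^2\beta_3 + 4\beta_3\sqrt{\beta_1'/\beta_1} - r^4\beta_2^2 - 6r^2(1 - r^2)\beta_2 + \beta_2\beta_2'$$
$$= r^4\beta_6' + 6r^2(1 - r^2)\beta_4' - 4r^2\beta_3' + 4\beta_3'\sqrt{\beta_1/\beta_1'} - r^4\beta_2'^2 - 6r^2(1 - r^2)\beta_2' + \beta_2'\beta_2 \quad\ldots\ldots\text{(xviii)},$$

which again involves the complicated β-relation

$$r^4(\beta_6 - \beta_6') + 6r^2(1 - r^2)(\beta_4 - \beta_4') - 4r^2(\beta_3 - \beta_3') + 4(\beta_3\beta_1' - \beta_3'\beta_1)/\sqrt{\beta_1\beta_1'} = 0 \quad\ldots\ldots\text{(xix)}.$$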

It is difficult to see how the form of variation of one character can be related by
the correlation between that and another character to the form of variation of the
second character as (xix) would indicate. If it were we should get into great
difficulties in dealing with similar conditions to (xix) for a large number of characters
with different correlations. If as it appears to me (xix) would need to be satisfied
independently of r, then we must have

$$\beta_6 - 6\beta_4 = \beta_6' - 6\beta_4',$$
$$3\beta_4 - 2\beta_3 = 3\beta_4' - 2\beta_3',$$
$$\beta_3/\beta_1 = \beta_3'/\beta_1' \quad\ldots\ldots\text{(xx)}.$$

The second of (xx) by aid of (xvii) leads us to 
2 Bs ; 2 8; 
96 (1— 5g) =98 (1-5 i): 

whence $\beta_1 = \beta_1'$, and as $\beta_2 = \beta_2'$, it follows that $\beta_3 = \beta_3'$, $\beta_4 = \beta_4'$, $\beta_6 = \beta_6'$, that is to
say the total frequencies of the two correlated characters must possess variation
practically of the same type.


Now I find this is very far from being the case in distributions which differ 
widely from the normal correlation surface. Thus it follows that the hypothesis of 
homoscedasticity, linear regression and homocliticity fails for such cases. I therefore 
modified the linear regression and adopted skew regression, homoscedasticity and 
homocliticity. I again got relations between the 6's, but of a much higher degree 
of complexity. These were tested by Mr A. W. Young and myself on the skew 
correlation surfaces of barometric data, but were found to fail. Direct investigation 
afterwards showed me that while the regression differed to some extent from 
linearity, it was the homoscedasticity which was in the first place the erroneous 
assumption. The arrays were very far from having the same standard deviations.


Until therefore some theoretical advance is made in the investigation of skew 
regression surfaces, especially for those which have linear or nearly linear regression 
combined with heteroscedasticity, it is unlikely that we shall have any adequate 
method of determining high product moment-coefficients from low ones. We are 
accordingly thrown back on direct determination of the high product moment-
coefficients, if we wish to determine a Tchebycheff limit. The work of determining
$I_4$ would involve a whole round of 8th order moment-coefficients and product
moment-coefficients. It would then give us a limit of the order .95 for .99. Lower
order I’s would hardly give values of much importance, and it may be questioned
whether a rough limit of the kind required could not be better obtained by inserting 
the desired contour on a “scatter diagram” and simply counting the dots which 
fall outside it, or indeed by taking the best fitting normal surface to the actual 
distribution. The reader may question whether something better could not be 
achieved for skew correlation Tchebycheff limits by some contour other than the
ellipse. This would undoubtedly be the case, if we knew the forms of the skew- 
correlation contours, for then we should undoubtedly choose this equi-probable locus 
for our boundary. But as we have only a knowledge of these empirically—experience 
shows them to be frequently pear or lemniscate loop shaped—we get little help for 
our present problem. 


One other aspect of the matter may be briefly considered. We may find a limit
to the probability that an event or individual will lie within a circle of radius R
round the origin. This corresponds to Schols’ Problem*. It may be useful to
have a Tchebycheff limit for this case, although we have yet to meet the particular
instance in practical statistics where it would be of marked advantage†.


We can best investigate this problem de novo. 


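Let
$$I_s = \iint (x^2 + y^2)^s \, \phi(x, y) \, dx \, dy.$$

Then if $R$ be any radius round the origin
$$\frac{I_s}{R^{2s}} = \iint \left( \frac{x^2 + y^2}{R^2} \right)^s \phi(x, y) \, dx \, dy,$$

the integral being taken to include the whole volume of the probability surface
$z = \phi(x, y)$. Now pick out those elements of the integral for which $x^2 + y^2$ is
$> R^2$, then
$$\frac{I_s}{R^{2s}} > \iint \left( \frac{x^2 + y^2}{R^2} \right)^s \phi(x, y) \, dx \, dy,$$

where the integration extends over the above-mentioned elements only, and is
therefore
$$> \iint \phi(x, y) \, dx \, dy,$$

but this integral is $1 - P$, where $P$ is the probability that the individual falls within
the distance $R$ of the origin. Thus the Tchebycheff limit is given by
$$P > 1 - \frac{I_s}{R^{2s}}.$$

Now clearly we have
$$I_s = \iint (x^2 + y^2)^s \, \phi(x, y) \, dx \, dy
= p_{2s,0} + s\, p_{2s-2,2} + \frac{s(s-1)}{1 \cdot 2}\, p_{2s-4,4} + \dots$$
$$= \sigma_1^{2s}\, q_{2s,0} + s\, \sigma_1^{2s-2} \sigma_2^{2}\, q_{2s-2,2} + \frac{s(s-1)}{1 \cdot 2}\, \sigma_1^{2s-4} \sigma_2^{4}\, q_{2s-4,4} + \dots$$

Now write $R = \lambda \sqrt{\sigma_1^2 + \sigma_2^2}$, and further take $\tan\theta = \sigma_2/\sigma_1$. Then
$$\frac{I_s}{R^{2s}} = \frac{1}{\lambda^{2s}} \Big\{ \cos^{2s}\theta \, q_{2s,0} + s \cos^{2s-2}\theta \sin^2\theta \, q_{2s-2,2} + \frac{s(s-1)}{1 \cdot 2} \cos^{2s-4}\theta \sin^4\theta \, q_{2s-4,4}$$
$$\qquad + \frac{s(s-1)(s-2)}{1 \cdot 2 \cdot 3} \cos^{2s-6}\theta \sin^6\theta \, q_{2s-6,6} + \dots \Big\}.$$

For the particular case in which $s = 1$
$$\frac{I_1}{R^2} = \frac{1}{\lambda^2} (\cos^2\theta + \sin^2\theta) = \frac{1}{\lambda^2}.$$

For $s = 2$,
$$\frac{I_2}{R^4} = \frac{1}{\lambda^4} \left( \cos^4\theta \, \beta_2 + 2 \cos^2\theta \sin^2\theta \, q_{22} + \sin^4\theta \, \beta_2' \right).$$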


* Over de Theorie der Fouten in Ruimte en in het platte Vlak, Verhandlingen der K. Akademie van
Wetenschapen, Deel xv, pp. 1—68, Amsterdam, 1875. Translated into French in the Annales de l’École
polytechnique de Delft, Tome II, pp. 123—178. Leide, 1886.

† It is conceivable that the solutions given might be serviceable in the case of testing machine guns
against a target.
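Now a good approximation to $q_{22}$ by (xiii) must be $\tfrac{1}{2}(\beta_2 + \beta_2')\, r^2 + (1 - r^2)$;
hence substituting

$$\frac{I_2}{R^4} = \frac{1}{\lambda^4} \Big\{ (\beta_2 - 3)(\cos^4\theta + r^2 \sin^2\theta \cos^2\theta) + (\beta_2' - 3)(\sin^4\theta + r^2 \cos^2\theta \sin^2\theta)$$
$$\qquad + 3 - 4(1 - r^2) \cos^2\theta \sin^2\theta \Big\}.$$

For the special case of normal distribution, if we write $\kappa^2 = 4(1 - r^2) \cos^2\theta \sin^2\theta$,

$$\frac{I_2}{R^4} = \frac{1}{\lambda^4} (3 - \kappa^2).$$

Again
$$\frac{I_3}{R^6} = \frac{1}{\lambda^6} \left\{ \cos^6\theta \, \beta_4 + 3 \cos^2\theta \sin^2\theta \, (\cos^2\theta \, q_{42} + \sin^2\theta \, q_{24}) + \sin^6\theta \, \beta_4' \right\},$$

and for a normal distribution,
$$\frac{I_3}{R^6} = \frac{1}{\lambda^6} (15 - 9\kappa^2).$$

Further general cases can be at once written down, but it will suffice to give
here the leading values of $I_s$ for a normal distribution:

$$\frac{I_1}{R^2} = \frac{1}{\lambda^2}, \qquad \frac{I_2}{R^4} = \frac{1}{\lambda^4}(3 - \kappa^2), \qquad \frac{I_3}{R^6} = \frac{1}{\lambda^6}(15 - 9\kappa^2),$$

$$\frac{I_4}{R^8} = \frac{1}{\lambda^8}(105 - 90\kappa^2 + 9\kappa^4), \qquad \frac{I_5}{R^{10}} = \frac{1}{\lambda^{10}}(945 - 1050\kappa^2 + 225\kappa^4),$$

$$\frac{I_6}{R^{12}} = \frac{1}{\lambda^{12}}(10{,}395 - 14{,}175\kappa^2 + 4725\kappa^4 - 225\kappa^6),$$

$$\frac{I_7}{R^{14}} = \frac{1}{\lambda^{14}}(135{,}135 - 218{,}295\kappa^2 + 99{,}225\kappa^4 - 11{,}025\kappa^6),$$

$$\frac{I_8}{R^{16}} = \frac{1}{\lambda^{16}}(2{,}027{,}025 - 3{,}783{,}780\kappa^2 + 2{,}182{,}950\kappa^4 - 396{,}900\kappa^6 + 11{,}025\kappa^8).$$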
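The leading values listed above can be cross-checked numerically. The sketch below is not the route taken in the text: it assumes the normal surface is referred to its principal axes, so that, with λ = 1, the scaled squared distance is a·Z₁² + b·Z₂² with a + b = 1 and ab = κ²/4, and it uses E[Z^(2k)] = (2k − 1)!! for a standard normal Z. The function names are hypothetical; the coefficients are those of the polynomials in κ² given above.

```python
from math import comb, sqrt

# Coefficients of I_s (normal case, lambda = 1) in ascending powers of kappa^2,
# copied from the list of leading values in the text.
PEARSON_COEFFS = {
    1: [1],
    2: [3, -1],
    3: [15, -9],
    4: [105, -90, 9],
    5: [945, -1050, 225],
    6: [10395, -14175, 4725, -225],
    7: [135135, -218295, 99225, -11025],
    8: [2027025, -3783780, 2182950, -396900, 11025],
}


def odd_double_factorial(k):
    """Return (2k-1)!!, i.e. E[Z^(2k)] for a standard normal Z (1 when k = 0)."""
    out = 1
    for j in range(1, 2 * k, 2):
        out *= j
    return out


def i_s_polynomial(s, k2):
    """Evaluate the tabled polynomial for I_s at kappa^2 = k2."""
    return sum(c * k2 ** n for n, c in enumerate(PEARSON_COEFFS[s]))


def i_s_principal_axes(s, k2):
    """Assumed cross-check: expand E[(a*Z1^2 + b*Z2^2)^s] by the binomial theorem."""
    kp = sqrt(1.0 - k2)                      # kappa' = sqrt(1 - kappa^2)
    a, b = (1.0 + kp) / 2.0, (1.0 - kp) / 2.0  # a + b = 1, a*b = kappa^2 / 4
    return sum(
        comb(s, j) * a ** j * b ** (s - j)
        * odd_double_factorial(j) * odd_double_factorial(s - j)
        for j in range(s + 1)
    )


if __name__ == "__main__":
    for s in PEARSON_COEFFS:
        print(s, i_s_polynomial(s, 0.4), i_s_principal_axes(s, 0.4))
```

The two columns printed should agree to rounding error for every s from 1 to 8, which is a useful guard against misprinted coefficients.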


The following table gives the maximum Tchebycheff limit for the probability of
an individual falling within the circle $\lambda \sqrt{\sigma_1^2 + \sigma_2^2}$ for various values of

$$\kappa^2 = 4(1 - r^2)\, \sigma_1^2 \sigma_2^2 / (\sigma_1^2 + \sigma_2^2)^2.$$

$(I_s)$ denotes the particular $I$ from which the maximum limit is found. $(I_8?)$
denotes that the corresponding numerical value is a Tchebycheff limit found from
$I_8$, but it is not known whether $I_9$ would not give a higher value, $I_9$ not having
been tabled. The second part of the table provides the values of $I_s$ from which
the first part has been computed. They may be useful in the determination of the
Tchebycheff limits for other values of $\lambda$.


I. Generalised Tchebycheff Limit for Schols’ Problem with a Normal Distribution.
Radius of circle = λ√(σ₁² + σ₂²), κ² = 4(1 − r²) σ₁²σ₂²/(σ₁² + σ₂²)².

| κ²  | λ = 1   | 1.25      | 1.5         | 2.0                | 2.5         | 3                  | 3.5              | 4.0              |
|-----|---------|-----------|-------------|--------------------|-------------|--------------------|------------------|------------------|
| 0.0 | 0 (I_1) | .36 (I_1) | .5556 (I_1) | .8125 (I_2)        | .9386 (I_3) | .98400 (I_4 = I_5) | .996924 (I_6)    | .999528 (I_8)    |
| 0.1 | 0 (I_1) | .36 (I_1) | .5556 (I_1) | .81875 (I_2)       | .9422 (I_3) | .98574 (I_5)       | .997329 (I_6)    | .999611 (I_8)    |
| 0.2 | 0 (I_1) | .36 (I_1) | .5556 (I_1) | .8275 (I_2)        | .9459 (I_3) | .98740 (I_5)       | .997707 (I_6)    | .999685 (I_8?)   |
| 0.3 | 0 (I_1) | .36 (I_1) | .5556 (I_1) | .83125 (I_2)       | .9496 (I_3) | .98899 (I_5)       | .998109 (I_7)    | .999749 (I_8?)   |
| 0.4 | 0 (I_1) | .36 (I_1) | .5556 (I_1) | .8375 (I_2)        | .9538 (I_4) | .99050 (I_5)       | .998478 (I_7)    | .999805 (I_8?)   |
| 0.5 | 0 (I_1) | .36 (I_1) | .5556 (I_1) | .84375 (I_2)       | .9592 (I_4) | .99193 (I_5)       | .998806 (I_7)    | .999853 (I_8?)   |
| 0.6 | 0 (I_1) | .36 (I_1) | .5556 (I_1) | .8500 (I_2 = I_3)  | .9645 (I_4) | .99333 (I_6)       | .999096 (I_8)    | .999893 (I_8?)   |
| 0.7 | 0 (I_1) | .36 (I_1) | .5556 (I_1) | .8641 (I_3)        | .9696 (I_4) | .99490 (I_6)       | .999380 (I_8)    | .999927 (I_8?)   |
| 0.8 | 0 (I_1) | .36 (I_1) | .5654 (I_2) | .8781 (I_3)        | .9746 (I_4) | .99631 (I_6)       | .999609 (I_8)    | .999954 (I_8?)   |
| 0.9 | 0 (I_1) | .36 (I_1) | .5852 (I_2) | .8922 (I_3)        | .9809 (I_5) | .99770 (I_7)       | .999788 (I_8)    | .999975 (I_8?)   |
| 1.0 | 0 (I_1) | .36 (I_1) | .6049 (I_2) | .90625 (I_3 = I_4) | .9879 (I_6) | .99895 (I_7)       | .999920 (I_8?)   | .999991 (I_8?)   |


II. Values of the functions I_s forming the denominator of the Tchebycheff Limit to
the probability that an Individual will fall, for the case of Normal Bi-variate
Frequency, within a given circle of radius λ√(σ₁² + σ₂²).

| κ²  | I_1 | I_2 | I_3  | I_4    | I_5    | I_6        | I_7         | I_8            |
|-----|-----|-----|------|--------|--------|------------|-------------|----------------|
| 0.0 | 1   | 3.0 | 15.0 | 105.00 | 945.00 | 10,395.000 | 135,135.000 | 2,027,025.0000 |
| 0.1 | 1   | 2.9 | 14.1 | 96.09  | 842.25 | 9,024.525  | 114,286.725 | 1,670,080.7025 |
| 0.2 | 1   | 2.8 | 13.2 | 87.36  | 744.00 | 7,747.200  | 95,356.800  | 1,354,429.4400 |
| 0.3 | 1   | 2.7 | 12.3 | 78.81  | 650.25 | 6,561.675  | 78,279.075  | 1,077,729.5025 |
| 0.4 | 1   | 2.6 | 11.4 | 70.44  | 561.00 | 5,466.600  | 62,987.400  | 837,665.6400   |
| 0.5 | 1   | 2.5 | 10.5 | 62.25  | 476.25 | 4,460.625  | 49,415.625  | 631,949.0625   |
| 0.6 | 1   | 2.4 | 9.6  | 54.24  | 396.00 | 3,542.400  | 37,497.600  | 458,317.4400   |
| 0.7 | 1   | 2.3 | 8.7  | 46.41  | 320.25 | 2,710.575  | 27,167.175  | 314,534.9025   |
| 0.8 | 1   | 2.2 | 7.8  | 38.76  | 249.00 | 1,963.800  | 18,358.200  | 198,392.0400   |
| 0.9 | 1   | 2.1 | 6.9  | 31.29  | 182.25 | 1,300.725  | 11,004.525  | 107,705.9025   |
| 1.0 | 1   | 2.0 | 6.0  | 24.00  | 120.00 | 720.000    | 5,040.000   | 40,320.0000    |
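For values of λ or κ² not in the table, the limit may be recomputed from the normal-case polynomials for I_s. The following is a minimal sketch under that assumption: it evaluates 1 − I_s/λ^(2s) for s = 1 to 8 only (I_9 and beyond are not considered, as in the table) and reports the largest. The function names are hypothetical.

```python
# Coefficients of I_s (normal case, lambda = 1) in ascending powers of kappa^2,
# taken from the leading values given in the text.
COEFFS = {
    1: [1],
    2: [3, -1],
    3: [15, -9],
    4: [105, -90, 9],
    5: [945, -1050, 225],
    6: [10395, -14175, 4725, -225],
    7: [135135, -218295, 99225, -11025],
    8: [2027025, -3783780, 2182950, -396900, 11025],
}


def i_s(s, k2):
    """Value of I_s for the normal surface when lambda = 1 and kappa^2 = k2."""
    return sum(c * k2 ** n for n, c in enumerate(COEFFS[s]))


def best_limit(lam, k2):
    """Return (limit, s): the largest value of 1 - I_s / lam**(2s) over s = 1..8.

    A limit found at s = 8 may still be improved by an untabled I_9,
    which is what the table marks as (I_8?).
    """
    candidates = [(1.0 - i_s(s, k2) / lam ** (2 * s), s) for s in COEFFS]
    return max(candidates)


if __name__ == "__main__":
    limit, s = best_limit(2.0, 0.4)
    print(f"lambda = 2, kappa^2 = .4: P > {limit:.4f} from I_{s}")
```

For λ = 2, κ² = .4 this gives the limit .8375 from I_2, the value used in the comparison with the actual probability.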


The reader may be curious to know whether the Tchebycheff limit gives
a better result for Schols’ circles than for the elliptic contours. The actual probability
of an individual falling within the circle of radius $\lambda \sqrt{\sigma_1^2 + \sigma_2^2}$ is given by

$$P = 1 - \frac{\kappa}{\pi} \int_0^{\pi} \frac{e^{-\frac{\lambda^2}{\kappa^2} (1 - \kappa' \cos\theta)}}{1 - \kappa' \cos\theta} \, d\theta,$$

where $\kappa' = \sqrt{1 - \kappa^2}$ and $\kappa^2 = 4(1 - r^2)\, \sigma_1^2 \sigma_2^2 / (\sigma_1^2 + \sigma_2^2)^2$ as before.

I have not succeeded in finding any rapidly converging expansion for this
expression*, and have been reduced to evaluating its argument and using a quadrature
formula. Thus for $\lambda = 2$, $\kappa^2 = .4$, I find

P = .963,3694.


* Unfortunately Schols has not tabled P, but only gives the values of λ for ten values of κ′, which
occur when P = 1/2, i.e. radial values for generalised “probable errors.”
The process is not as long as it might seem. Indeed if we only need four decimal
places, it is quite adequate to integrate only through the first quadrant, the second
contributes nothing of importance. The value given by the last Tchebycheff limit is

P > .8375.
This is of the same order of divergence as we found for the elliptic contour, i.e. for
χ² = 7, we had P = .9698, with a Tchebycheff limit P > .8600. Thus the measure
of approach does not seem very close in this case until we reach higher values of λ.
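The quadrature can be sketched as follows, assuming the integral takes the form P = 1 − (κ/π) ∫₀^π exp{−(λ²/κ²)(1 − κ′ cos θ)} / (1 − κ′ cos θ) dθ with κ′ = √(1 − κ²). Composite Simpson's rule is used here only for convenience; the text does not say which quadrature formula was employed, and the function name and the default number of panels are hypothetical.

```python
from math import cos, exp, pi, sqrt


def schols_probability(lam, k2, n=2000):
    """Probability of falling within the circle of radius lam*sqrt(s1^2 + s2^2).

    Normal surface assumed; k2 is kappa^2 with 0 < k2 <= 1, and n is the
    (even) number of Simpson panels on [0, pi].
    """
    if not 0.0 < k2 <= 1.0:
        raise ValueError("kappa^2 must lie in (0, 1]")
    if n < 2 or n % 2:
        raise ValueError("n must be a positive even integer")
    kappa = sqrt(k2)
    kappa_prime = sqrt(1.0 - k2)

    def integrand(theta):
        d = 1.0 - kappa_prime * cos(theta)
        return exp(-(lam ** 2 / k2) * d) / d

    h = pi / n
    total = integrand(0.0) + integrand(pi)
    for j in range(1, n):
        # Simpson weights: 4 on odd nodes, 2 on even interior nodes.
        total += (4.0 if j % 2 else 2.0) * integrand(j * h)
    return 1.0 - (kappa / pi) * (h / 3.0) * total


if __name__ == "__main__":
    p = schols_probability(2.0, 0.4)
    print(f"lambda = 2, kappa^2 = .4: P = {p:.4f} (Tchebycheff limit .8375)")
```

With λ = 2 and κ² = .4 the result should lie close to the value of P quoted above and well above the limit .8375; for κ² = 1 the integrand is constant and the formula reduces to 1 − e^(−λ²), which serves as a check.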


On the whole we must express disappointment at the results of the Tchebycheff 
process. We had found Tchebycheff’s own limit based only on the second moment 
of small practical value, although it is to be found occupying a prominent position 
in many continental works on probability. By extending it to higher moments and 
product-moments we have reached results which are great improvements on the 
original Tchebycheff limit, but the method still lacks the degree of approximation 
(except for probabilities over .99, say) which would make the result of real value in
practical statistics. It is, however, conceivable that some more ingenious application 
of Tchebycheff’s idea may lead to a limit more close to the actual value of the 
probability. 


[Plate I. Biometrika, Vol. XII, Parts III and IV. Portrait: Charles Buckman Goring in 1912, from a sketch by R. Ara]


CHARLES B. GORING, 1870—1919. 


“His work won full recognition from those who value scientific research. But 
it is a strange commentary on the Civil Service, that, when so pressing a problem 
as prison reform still confronts us, so fine a worker and so human a man should 
have been given but the (medical) administration of a great prison instead of being 
called in to deal with a work for which all his gifts supremely fitted him.” The 
Nation. 


The late Charles B. Goring, M.D., was a distinguished student of University 
College, London, and afterwards a Fellow of that College. During his career his 
studies were far from confined to medicine: he was much interested in literature 
and philosophy, being awarded the John Stuart Mill Studentship in Philosophy of 
Mind and Logic in 1893, probably the only occasion on which that studentship 
has fallen to a medical exhibitioner*. It was not therefore surprising to those 
who knew something of the remarkable powers of sympathy, the width of interests 
and the facility of expression which characterised Goring to find that he would 
write a blue-book, as no blue-book has been written since the time of Matthew Arnold. 
He would handle facts, but at the same time he would appeal by his imagination 
and gift of language not only to the sociologist but to every man who is fascinated 
by the human spirit in all its diverse phases. Goring lived with his criminals, and 
studied them in and out of prison as the naturalist studies life in the field, and as 
the humanist studies mankind in its thronged resorts. Ask Goring what a convict’s 

mind was like and he replied unhesitatingly: Like yours and mine. The same 
delicate spirit of sympathy that went out to his friends in both the joy and the 
sorrow of life, drew the criminal to him, and the link often grew so close that the 
prison medical officer became the father-confessor: the psychology of the criminal 
mind was laid bare, and thus Goring’s insight into criminality, its source and its 
motives, grew deeper and more and more coordinated as the years of service increased. 
Yet he never hesitated to exhibit the same tender sympathy alike to each new 
sojourner and to each oft returning old prison inmate, while his own nature widened 
and strengthened under an environment which appears to dull the mentality of so 
many men in the prison service. Only last Christmas the present writer dis- 
cussed with him the possibility of a series of essays on the psychology of crime to 
be based indeed on facts acquired by scientific study, but to exhibit a structure 
from which the scaffolding should have been stript, and which should convince 
the beholder of the fitness of its purpose solely by the beauty and truth of its 
lines. The path to truth is an arduous one, but when we have reached the 


* Goring was awarded the Weldon Medal and premium by the University of Oxford in 1914 and 
never will a more fitting award of that medal be made; his work ‘‘ The English Convict ” was undoubtedly 
the finest contribution to biometry of its quinquennium, 
summit we know by the width of our prospect into all neighbouring spheres that
we have attained it: 


“Qui veram habet ideam, simul scit se veram habere ideam, nec de rei
veritate potest dubitare.” 


We may now have to await that work for generations until another prison medical 
officer arises with Goring’s scientific knowledge, discriminative sympathy and fine 
power of expression. Battling with a gaol epidemic of influenza, when he should 
himself have been in bed, Goring fell an easy prey to pneumonia, which a strong 
will coupled with a spare and delicate frame cannot resist as their combination so 
often does many of Death’s onsets. Goring died as he himself and his friends 
would have wished, doing his duty to the last at his post. His work was uncompleted 
as good men’s work so often must be. He was studying at the time the influence 
of the war on the nature and frequency of crime—a subject on which much will no 
doubt be said, but most probably with small scientific basis. How shall we estimate 
his work, now that he has left us? We pass by the criticisms of men inside and 
outside the prison service, for they will leave neither in their own productions nor 
in their criticisms anything that will remain of permanent value to the new science 
of criminology as Goring outlined it ; those who have had like experience lack either 
his insight, or his logical mentality, or his power of expression. They were not 
trained in the same school, nor had they the penetralia mentis, or rather what
“the Romans called ingenium,” which through its very innateness carries mankind
onward a step, assured, not doubtful or to be retraced. The contest between 
mediocrity and inspiration is as old as history and the creator, the poet, wins, if 
not in life, yet thereafter. The world has yet to realise that achievement in every 
field is the product of trained imagination alone. Truth in science as in art is not 
the product of mere computation or careful observation, but of these guided by 
fertility of imagination. The creative mind has the potentiality of poet, artist 
and scientist within its grasp, and Goring’s friends were never very certain in 
which category to place him. Perhaps the specification was as difficult and would 
be as unprofitable as it must ever be in the case of the Florentine, the master 
spirit of this type of mind. 

To the present writer fell the good fortune to be in close touch with Goring 
(and his keen co-worker, H. E. Soper) for that long period of two and a half years 
during which “The English Convict” was in process of creation. He observed 
Goring in times of difficulty when the intertwined skein would not unravel, and in 
times of achievement when the tangle loosened as by magic. He realised the 
quiet persistency with which Goring grappled with the most intricate problems 
and the gentle satisfaction he exhibited when assimilating and recording a new and 
striking point. When finally the great manuscript had gone to press, we who had 
been working alongside him at our own tasks knew one and all that while we were 
losing a cherished daily intimacy, we had still individually gained a life-long friend. 
We felt that had the world been rightly organised—which it ever fails to be— 
a post in our midst would have been found available for Charles Goring, for no 
man was better fitted than he to “study those agencies under social control that
may improve or impair the racial qualities of future generations, either physically 
or mentally”; none we had come across was so well suited to make knowledge 
reached by scientific research a factor of social progress. He knew how to clothe 
scientific results in a garb which captivated the mental eye of him who listened to 
his spoken or read his written words. Goring was intended by nature for a master- 
craftsman of exposition. His sceptical spirit demanding a rigid foundation for 
truth was combined with an unlimited enthusiasm that truth when known should 
be proclaimed to the many. Yet in his own life, “Thrones, powers, dominions 
blocked the view, with episodes and underlings.” 


What then is the outcome of Goring’s work? Has he decreased crime or 
bettered the lot of the criminal? Not directly, the solitary individual can achieve 
little in this sense; he has moved stones from the path of the outcast, and we can 
picture many a criminal who would have wished to stand by his graveside. Has 
he pointed out the lines upon which the state in future should deal with its 
defaulters? Again not directly, but only indirectly. What then has he achieved ? 
He has given us a portrait of the criminal as he really exists; he has painted 
in the nature of his physique, he has indicated his facial and underlying mental 
traits, his hereditary tendencies and his home associations. And he has made for 
ever atypical the criminal of current drama and novelistic literature. Here it is 
that literature owes a deep debt to Goring. It cannot survive without its villains 
but the individual writer will never be as intimate as Goring was with poisoner, 
murderer and spy. Yet if that writer approaches with intuition not the masses 
of statistical data, but the text of Goring’s life-work, even in its recently issued 
abridgement*, he will learn to see the criminal as Goring saw him, he will learn to 
know the real man and his attitude to crime. He will learn that Goring was a 
creator in the literary sense†, and with imagination stirred he will feel the
impulse to adopt and adapt that realistic portrait of the criminal as only true art 
can do. Through literature the world at large will know at last what crime and 
the criminal really are. Not only will literature profit, but the world which 
easily grasps truth when depicted by art will understand and gain something of 
the spirit of the man whose life’s work alas! is embraced within the livid wrappers
of a government publication. 


“En mands gerning er hans sjael, og sin gerning skal blive ved at leve pa
jorden.”—The work of a man is his soul, and on earth his work shall not perish.


* “The English Convict” (Abridgement), Wymans & Co., 1915.

† The present writer has many sins to atone for, but perhaps none he regrets now more than the
stringency with which he docked the original MS. of Charles Goring of many of its literary qualities as 
unsuited to a scientific and government publication. 
APPRECIATIONS OF CHARLES GORING.


To the readers of Biometrika the following sympathetic accounts of the personality 
of Charles Goring will appeal as they do to the Editor, who deeply values the 
privilege of being allowed to publish these very intimate characterisations. The 
first is by Mr E. V. Lucas, a college friend of Goring’s; they both belonged to one 
of those periods of keen intellectual activity which arise occasionally in college life 
owing partly to the action of waves of external thought, but more often to the 
presence internally of one or two original minds. Outwardly the period in question 
was marked by the foundation of the Students’ Union and the meteoric brilliancy 
of The Privateer—a college journal that one did not grudge purchasing. It was 
for Goring the moulding time,—the golden days, when there was leisure to think, 
interpolated between an uncongenial office experience and the wider but none the 
less toilsome experience of a medical officer on a hospital ship during the South 
African War. 

The second appreciation is the oration bravely spoken over his grave by his 
widow. I have not ventured to leave out a sentence of it. Round the grave were 
gathered the friends of his creative period, the friends of his youth, the friends of 
his prison calling, from prison commissioner to warder, and a scattering of humbler 


friends unknown to most of us, but none the less there out of love to one of 
the finer spirits of this life. That brilliant June day, with its unique ritual, 
when we paid the last respects to Charles Goring, will remain in the memory of 
those present as unique as the nature of the man, who in leaving us reduces still 
further that little school of trained biometricians, who value humanism as well as 
science. 


I. CHARLES B. GORING AS A STUDENT. 


I have been asked to write a few words about Charles Goring, and I have 
tried, because I respect the asker; but they will be incomplete because I have
seen Goring of late so little and hardly knew him in maturity at all: as a husband, 
and a father, and an intellectual force with all his powers at their richest. But of 
the Charles whom, in the eighteen nineties, we knew, the Charles whom we loved, 
my impressions are fresh and will always be. His personality provided for that. 


I say “whom we loved,” but I think we did more than love. I think that if it 
were possible, if it were conceivable, that any harm should be coming to him, 
there is nothing we would not have done to interpose our own inferior bodies 
between him and it. For he inspired not only affection but protectiveness. We 
felt that we were his guardians: his—in a very peculiar sense—owners. Not that 
he lacked any qualities of self-defence. Far from it. His mind was crystal clear, 
his attitude to life and its problems was fearless; but he had an unworldliness, 
a childlike radiance, that seemed to demand from his friends a contribution of 
cotton wool. Let me say again that he did not need this, but we all wanted to
provide it. I have said that his attitude to life and its problems was fearless.
But it was more than that: it was challenging and ardent. Had there been nothing 
to probe and inquire into, he would not have been the happy man he was; for he 
was a born inquirer—inquisitor even—and mistrusted all traditional face-values. 


Exactly how I came to be admitted to Goring’s circle I never understood then, 
and cannot now fathom. Because where he and his friends brought to their dis- 
cussions and disputations knowledge and seriousness, I had nothing but instinct 
and impatience. But they suffered me, and I was permitted to sit on the outskirts 
and listen, and now and then to interrupt. What I chiefly remember of those 
evenings—at all kinds of places—at Highgate, at Hampstead, in rooms near the 
Museum, on the boat to Margate, on the Broads,—what I chiefly remember is 
Charles in argument: eager, stimulating, vivid, humorous, always gently reasonable 
and never losing sight of the main proposition. I suppose he was the honestest 
and most understandingly tolerant man that ever lived. He never trimmed; he 
rarely condemned; and he had no fear. No fact was too stark and naked for him;
indeed, what he wanted was stark and naked facts. We would all have our say—
some of us solid and some of us fluid—and then he would deal with us, with quiet
Socratic questionings; and all the while we would see, burning within his beautiful
workmanlike brain, the soft steady flame of that lamp of enthusiasm which was
never to be dimmed until a few weeks ago it was all too soon extinguished:
enthusiasm for the truth, wherever found. 


Of what dark passages that lamp was to illumine it is not for me to speak. 
There are others who have authority. But that no sweeter nature was ever allied 
to a passion for scientific investigation I feel myself to have the right to affirm. 


E. V. Lucas. 


June 17, 1919. 
II. CHARLES GORING AS HUMANIST.


In asking you all to come here today, I have done what seems to me a right 
thing to do, and a beautiful one: for, with your presence, I have made a circle 
round my husband’s spirit of those minds and hearts most intimate with his, and 
most valued by him....You all loved him; he loved everyone of you. With each 
one of you he had a separate and private friendship....It seems to me that I can do 
him no greater honour on this day than to give him what you have let me give him 
by coming here—your undivided thought of him, your clear memory, and the warm 
and poignant tenderness that I well know possesses each heart here at the very 
mention of his name—Charles Goring. 


I must ask you to forgive me if I read from this paper what I have to say, 
instead of speaking it, in a more natural manner. I should not find any difficulty 
in speaking to each one of you separately. It seems absurd that, simply because 
you are all here together, in a number, that I should find it difficult. Yet so it is.
And, therefore, for this reason, and also because on this occasion I can trust neither
my memory, nor my self-control, I hope you will forget—won’t even see—this bit 
of paper between us. 

I want to say, first, why I am speaking at all. There are two reasons. One is 
that I want to say something about my husband which may, perhaps, for a few 
instants, trace an outline of him upon the air, for you as well as for me—which 
may, for a moment, mark out his features for us, give us a glimmer of himself. 
That is one reason. The other is that I want to make, at his grave-side, and in 
the knowledge of death, certain affirmations. 


I have great difficulty in expressing myself here. I will ask you for your 
generosity with your tolerance. I ask it the more particularly because I know 
there is at least one amongst you—and probably there are more than one—who 
will find my attitude and desire foreign to his own. 


To this person, who has my respect, affection, gratitude, as I hope he knows, 
I want to say that, though I understand his inability to speak to us here today 
about my husband—and, in a way, I love him for that inability—yet I do regret 
it; and also I do not accept his point of view. 


My regrets are for the fact that his silence deprives us of a criticism, an 
appreciation of my husband—of my husband’s scientific mind and work especially— 
which no one else could give with equal authority, sincerity and eloquence. So 
there is room enough for regret I think....And then also, as I said, I do not accept 
my friend’s point of view, though I can salute it for its dignity. 

His view is, I understand, that reticence, and silence, and solitude best suit the 
great occasions of human experience—those of grief and loss, particularly. I feel—
I more than feel: I believe—the opposite. I believe in Voltaire’s saying: “Le
but de l’homme c’est l’action.” Action means words as well as deeds. I believe
that for whatever other purposes we may also possess life, there is a secret Injunction 
upon us—within us—to express things: to do, to make, to show. And it seems to 
me—it is more than feeling: it is a sort of moral urging—that when the great 
emotional experiences come to us, we ought to give them some outward, visible
expression: Form: form, in accordance with that law that, as I said just now, seems to
me to impose action upon us during our humanity: form that is beautiful. 


I have felt, then, in the great experience which has just come to me—the 
greatest I shall ever know—that unless I am to be false to my own instincts, and 
a coward to my own truth, I must testify by some outer form, and beauty of 
symbol, to the quality of my husband’s spirit, and the sacredness of his memory, at 
the hour of the burial of his body. 


Feeling, and believing this, I realise the disadvantage at which we stand—we 
who are Freethinkers—when, for our great occasions, we need a ceremonial, 
dignified, harmonious, simple. There, all the Churches, who have had time to grow 
old and beautiful, have the advantage of us. Their poets have had time to shape 
inarticulate cries and struggling aspirations into pathetic and stately ritual. Their 
artists have had time to bring colour, and line and music to the spaces set aside 
for those who suffer, and those who conquer themselves. We here today—those
among us who are Freethinkers—would not give up for these achievements, fine 
though they are, for our own best: the very essence that makes us Freethinkers. 
Nevertheless, the Churches, in this way, have the advantage of us. And when a 
person like myself wants, as I do today, to mark by outer form and beautiful symbol, 
the great spiritual experience that has come to me, there is no ceremonial—the 
legacy of the genius of ages—waiting for me: and I am at a loss. So disconcerting
might this position have been, that it would have been easy to yield to the 
temptation that has for the last week assailed me: the temptation to do nothing; 
to give way to difficulty; to accept despair. All the time that I have felt it 
urgent within me to do honour, somehow, to my husband, on the day of this 
burial—all that time I have also felt an unworthy fear of the effort: and I have 
very often nearly decided to make none, but to have the ashes of his body buried 
without a sign, and myself alone as witness. 


I am glad I have cheated neither the memory of my husband, nor my own 
instincts, by doing that. I am glad that, by your presence, by the mysterious 
sense of the unity there is upon us in these moments—by the singing of these boys, 
whose music he loved, by these flowers, by the good fortune of an exquisite day of 
sunshine and warmth—I am glad that outward forms acknowledge the inward 
grace: I am glad that the influence of a lovely spirit is abroad in the air above 
this grave. 


In speaking of my husband himself, I shall have to choose one quality only of 
him, I suppose, if I am to be clear. You will all know of others. And I shall not
speak at all of his special intellectual gifts....I think, perhaps, his rarest and 
most endearing quality was his particular kind of humaneness. I say “ his particular 
kind of humaneness” because it was not in the least like what is called “ humani- 
tarianism.” He had no sentimentality. And he was never in the least taken in 
by humbug. But his humaneness enabled him to know, and to like, the humanity 
even behind the humbug. There was in him at once a complete lack of prudery 
and a perfect personal rectitude. Charlie was as incapable of being shocking him- 
self as he was of being shocked at another person’s shockingness....The fact is that, 
apart from cruelty, he did not take what is called “evil” very literally. He thought 
that nearly all people were intensely likeable when you got to know them. So 
that his charity—of which everyone speaks who knows him—was far less forgiveness 
than it was sympathy; and his kindness was always loving-kindness. 


If you will let me, I will tell you one or two things about him that may, 
perhaps, trace that silvery outline of which I spoke....I think of a certain day, 
some years ago, when we had a really wonderful walk together. It was one of those 
fortunate days—those gift-days—when everything turns out successfully; when the
unexpected leaps up; when there is adventure through it all. I won’t give you the 
whole history of the walk, but only these points—to show you Charlie. 


We had just come down Villiers Street from the Strand, and were near the 
Embankment Gardens, when he pulled up suddenly with a look of intense 
alarm. He told me that one of his old convicts, discharged from Parkhurst, had
taken to newspaper-selling outside the Embankment Station. “He talks for hours,”
said Charlie desperately: “He has the eyes of a lynx. He spots me amongst 
thousands. He'll spot me in a minute. He always does. And that'll mean 
interminable conversation, and half-a-crown. Let’s get into the Gardens while 
there’s still a chance.” So we dived for the Gardens, and were just through the
gate, when he again pulled up. “After all,” he said, “I have managed to give him
the slip three times lately. It seems rather unfair to cheat the poor old boy again,
so soon. Let’s go by the station.” So we went by the station; and he was 
duly pounced upon, and a lengthy, amicable gossip ensued; and the half-crown 
passed from one pocket to the other....Now that seems to me very like Charlie: 
that pang of conscience, that sense of fellowship which made him feel that, by 
evading him too often, he was, what he called, “cheating the old boy.” 


After this, we got out on to the Embankment. It was a wonderful day, I re-
member, in early autumn. The river was stiffly rippled; the plane trees were
brilliant in colour and movement; rapid clouds were passing in the blue sky;
bright traffic was flashing and humming by in the broad roadway: it was a delight
to swing along the pavement, in the keen air, arm in arm—and quarrelling all the 
time, as we mostly did! Presently we reached the Temple, and passed upwards 
through the narrow passages and dark archways, and across the smiling silence of 
the Courtyards: and then out again, into Fleet Street; and up through Chancery 
Lane to Holborn; and so to the left, towards Oxford Street. And when we were 
in New Oxford Street, I suddenly became aware of an astounding apparition on 
the opposite side of the road. Charlie observed people very acutely when he was 
in close contact with them, but he didn’t notice things in crowds. He didn’t 
notice this man; and he continued to arguefy at my side, while I continued to 
amaze myself at the man. 


This person seemed to take up the whole street. It was not so much that he 
was so large, as that he was so blatant. His clothes were the most astounding 
things in vulgarity and newness that could be conceived. He wore a buttonhole 
that was an insult. And the way his boots shone, and his hat shone, and his 
walking-stick shone as he twirled it—the way he simply glared and revolved in 
glory, as it were, with Oxford Street as a mere margin for him—simply took one’s 
breath away. 


I hadn’t time to pull my husband’s arm, and stop his Infinites and Indefinites, 
before the whole bulk of this being was descending upon us across the road, and 
clasping Charlie with a fervent hand. I left my poor man stuttering in his grasp;
and went and looked into the Cameo Shop window. It was perfectly clear that he 
hadn’t the remotest idea who the man was, though he was pretending he knew him! 
And presently the volubilities broke down, and I heard this: “I don’t believe 
you know me, Doctor? I am.......” I didn’t catch the rest; but Charlie’s voice
cleared up in relief: “Oh, of course; of course!” and then proceeded to rapid and 
friendliest conversation. 
At last, with some terrific laughter, and a perfect blaze of complacency, I heard
the man exclaim: “Well, it’s all A.1—A.1, that’s what it is. A bit of All Right.
Everything’s going swimmingly. We’re off to the Rhine for a few weeks’ holiday. 
Hope to see you again, Doctor. So long!”—and he was away, with a flourish of his 
hat, down the street towards Holborn; while my husband came up to me, convulsed 
with merriment, saying: “Will you believe it? That’s another of them!” He was,
in fact, another convict. Also from Parkhurst. A very bad case of fraud, I believe. 
Quite unpardonable. Still, there he was: free again: at large: enjoying every 
moment of his regained existence. And, as we watched him disappear towards 
Holborn in his outrageous radiancy, the spectacle didn’t merely amuse Charlie or 
stagger him,—it didn’t shock him, and it didn’t sentimentalise him: but it made 
him rejoice at the thing in the man that could so rejoice in liberty; that could 
swagger so in the sun; that could be so little of a snob, and so free from the Past, 
that it could actually come bounding, in camaraderie, to an official of the Prison
in which its convict days had been spent! That was all right in him, whatever
else might be wrong: and it caught Charlie’s affection: he liked the man.


I hope this also may give you a touch of your friend. It is so difficult, with 
heavy words, to convey an intangible thing. But I hope your own knowledge of 
him may give you the feeling of his quality in all this. 


There is one thing more I want to tell you about him. It was last March, and 
we were in Manchester. We were having rather a rough time of it. We had no 
servant, and most of the rooms of the house were shut up, to reduce work and fires. 
The kitchen was our children’s playroom, and it was our dining-room as well. We 
had breakfast there, one morning, and were, as usual, distinctly late! and my 
husband had to hurry off immediately after into the Prison. It was a bitter 
morning: there was a perfect blizzard of snow and rain. Five minutes after he had 
gone, I heard his latch-key in the door, and he rushed back to the kitchen in 
a tremor of excitement and pity. He said he had found a woman in the street who 
was so ill she could hardly move. She was coughing herself to pieces; and he 
thought she had consumption, or some virulent form of influenza. He had brought 
her back with him. She was in the hall. 


And here I have a confession to make. I must make it, because if I do not,
I cannot show you what Charlie was. I was angry with him. I was angry because 
I was in deadly terror of the children catching influenza. It seemed to me terrible, 
at the moment, to bring that poor, infected creature into the room where the 
children were—the one room where there was a fire. 


Well, it is the look he gave me when I was angry with him that I want to tell 
you about—a look in which there were not so much reproach and surprise (though 
these were there) as a kind of lovely guilt: a baffled look: a look pleading for 
pardon, and saying in desperation: “Yes, I know, I know. But, in God’s name,
what was I to do?” All this was in that look, which was the very essence of 
Charlie: and, without a word between us, I bundled the children upstairs, and we 
fetched the poor thing from the hall into the kitchen. 
I can see him now, settling her by the fire, bringing her a footstool, taking her
poor, dripping shawl off her shoulders, and hanging it up to dry. We thought for
a time she was going to die; but she got better in a little while, and sat, un-
complainingly, coughing; and, when she was not coughing, smiling at the fire.
He had to tear off to the Prison, as soon as he could leave her, through the 
frightful storm, promising to bring help for her as soon as he could leave his work. 


He returned later in the morning, with an ambulance, and an order for a 
hospital: and again now, I can see him leading her carefully through the hall, 
lifting her into the carriage, nodding at her affectionately through the doorway, as
the carriage drove off; and then coming back to me for a moment, before he returned 
to his work, with that same muteness, that same look of an angel’s apology in 
his eyes.... 


This was Charlie absolutely—this passion of pity for suffering. In his last two 
days on earth, during the height of his delirium, one memory recurred, and haunted 
him over and over again: the memory of two little children whose case had been 
tried at the Assizes, and whose bodies he had had to examine, and had found 
marked and mutilated by the fiendish cruelty of their parents. These children he 
could not forget: he mourned and lamented them, seeing them before him in his 
fever, and calling, and calling upon us to take them, and save them.... 


If I had not told you this, I could not have shown you all that I meant by my 
husband’s humaneness: but I do not want the last impression that I, at any rate, 
leave with you, to be one of sadness. I want it to be one of happiness: because he 
was really an extraordinarily happy man. He was happy chiefly because of his 
nature and character, of course; but, also, he was fortunate. He had got the things
he most wanted in life. He never had any worldly ambitions at all. He had always
wanted three things: first, freedom to live a life of the intellect—of observation,
and of criticism; and this he was able very largely to do, in spite of the fact that
he also had to earn our living. And, secondly, he wanted Friendship: and he had 
Friendship. And, thirdly, he wanted Romantic Love: and he had Romantic Love. 
The things he wanted and hoped for when he was young, he found, and still wanted 
when he was middle-aged. And when he died at forty-nine, he took with him 
enthusiasms as eager as they were when he was twenty-five. 


I will say no more except to read you the inscription that I shall be putting over
the place where his ashes will lie. 


For a great many years, I have had in my mind a line of words whose music and 
meaning I very much liked. I only vaguely knew where it came from. It corre- 
sponded to the Christian triad: “Faith, Hope, and Charity”; and it ran thus. 
“Love, Pity, and Equanimity.”


During the last few days, when I was wanting to find something beautiful, and 
expressive of him, to put in words above my husband's grave, I thought of this line 
again: and I have found out that it comes from a Buddhist Sutta. I am not very
clear what a “Sutta” is? but I think it means a “Gospel.” This particular Sutta, 
from which I have got my line, describes the being who has attained the perfect
life—that is to say, the life of self-conquest. The passages from which I have made
my extracts are these:


(1) “And he lets his mind pervade one quarter of the world with thoughts of 
Love, and so the second, and so the third, and so the fourth. And thus the whole 
wide world, above, below, around, and everywhere, does he continue to pervade 
with heart of Love, far-reaching, grown great, and beyond measure.”


(2) “Just as a mighty trumpeter makes himself heard—and that without 
difficulty—in all the four directions; even so, of all things that have shape or life, 
there is not one that he passes by or leaves aside, but regards them all with mind 
set free, and deep-felt love, pity and equanimity.” 


The inscription as I shall put it above the grave will be this: 


“Here lie, in Sacredness and Honour,
the Ashes of the Body 
of 
CHARLES BUCKMAN GORING, 

Doctor of Medicine, Bachelor of Science, 

Fellow of University College, London, 
and 

Medical Officer in Chief of Strangeways Prison, Manchester. 
Born January 31st, 1870; Died May 5th, 1919.” 


And underneath I shall put this: 


“Of all things that have shape or life there is not one that he
passes by or leaves aside, but regards them all with mind set free 
and deep-felt love, pity, and equanimity.” 


KATIE MACDONALD GORING. 


ON THE NEST AND EGGS OF THE COMMON TERN 
(S. FLUVIATILIS). A COOPERATIVE STUDY.


W. ROWAN, E. WOLFF, AND THE LATE P. L. SULMAN, Fieldworkers. 
K. PEARSON, Reporter. 


E. ISAACS, E. M. ELDERTON, AND M. TILDESLEY,


Tabulators and Computers. 


(1) Origin of the Material and Method of Measurement. 


This paper may be looked upon as a continuation of that published in Biometrika, 
Vol. x. pp. 144—168. It is based upon a census of the eggs made July 3rd—20th, 
1914, and contained in Rowan’s Fifth MS. Report on the Faunistics of Blakeney Point, 
the Field Station under Professor F. W. Oliver's direction on the Norfolk coast. The 
year was a record year for the common tern, a marked contrast to 1913, the young 
were abundant as well as the eggs, and many of the birds were still laying. Some 
peculiar nests were found: (a) one entirely of seaweed, (b) another of large wood 
shavings, (c) one of selected small pebbles, (d) a very large nest—the largest yet met 
with. Some of the nests are illustrated in Plate II and will suffice to indicate the con- 
siderable differences between their make up and environment*. The range of ground 
colour with extent and distribution of mottling are indicated in Plate III, which 
should be taken in conjunction with Plate VIII of the earlier paper. There is 
every reason to believe that the two clutches, each of three eggs, were in both cases 
due to a single bird; the seventh egg, from a one-egg clutch, represents a peculiar egg
found in the examination of this year’s material. In all 515 clutches were recorded
as against 203 in 1913. In that year there were 13 clutches with 3 eggs each; in
1914 there were 198, and many of those with one or two eggs at the time had also one 
or two newly hatched chicks, bringing the total up to three. Even the nests with 
one egg (122 as compared with 119 in 1913) were actually nests with the first egg 
only of the clutch, for the birds were still laying, while most of the one-egg 
clutches in 1913 were either deserted or the egg addled.


Plate IV gives some further photographs taken of the Ternery. Fig. a is an 
attempt to catch the bird alighting in order to indicate the great length of the 


* The following illustrates a method of nest building, that of nest (d) above. “A common tern laid
close to the observation tent. At first there was no material whatever. But on the same day a few of 
the Psamma leaves from the tent were taken and deposited round the egg. The next day another egg 
was laid and more stuff was added. None of the Psamma had then been broken and the leaves radiated 
from the centre in all directions. On the second day the first few were broken and tucked neatly in 
all round. Then a third egg was deposited. More pieces of Psamma were added and the nest then had
a very ragged appearance. It took two more days before the nest was completed and tidied up.” 
wings. Fig. b is the only four-egg clutch observed. Plates V and VI show birds
sitting, the camera being about 18 inches from the bird. 

The characters observed were identical with those of 1913, namely : 

1. Length ($L$); 2. Breadth ($B$); 3. Longitudinal Girth ($G_l$); 4. Transverse
Girth ($G_t$); 5. Tone or Ground Colour; 6. Mottling; 7. Type of Nest.


The tone or ground colour was in 1914, however, divided into browns and 
greens. The scale of browns was that of the Colour Value Scale of Plate VIII of 
the first paper, and the green values were judged on a similar scale divided into 
corresponding classes a, b, c, d, e, f, g, h, i, k. These classes are distinguished by
the subscripts 1 and 2 for brown and green values respectively. Two eggs only
had to be excluded from these colour value observations; one was blue and the
other slatey gray in ground colour. These reduced the total number of eggs avail-
able for colour value reduction from 1110 to 1108. The classification of mottling
follows Plate IX of the earlier paper. Types of Nest were divided into three 
categories, $t_1$ = no hole in the ground and no materials, $t_2$ = a hole but no materials,
$t_3$ = both hole and materials. As only one nest (with a three-egg clutch) occurred in
type $t_1$, we have grouped $t_1$ with $t_2$, so that the distinction is really of unelaborated
and elaborated nests. 

Of the characters dealt with, the transverse girth ($G_t$) was really taken as a
check on the general accuracy of measurements. We should have

$\pi$ = Mean Transverse Girth / Mean Breadth,

or rather $\pi$ is equal to this ratio multiplied by the factor $(1 - r_{G_t B} V_{G_t} V_B + V_B^2)$,
where $r_{G_t B}$ is the correlation of the transverse girth with the breadth and $V_{G_t}$ and
$V_B$ equal $\tfrac{1}{100}$ of the coefficients of variation of transverse girth and breadth re-
spectively. This factor was .99990 in the previous set of observations and is
1.00006 now. Hence its influence on $\pi = G_t/B$ is insensible for our purposes.
We find $\pi = 3.2071$ against 3.2237 of the earlier series. Thus although the value
of $\pi$ is bettered, we still find the transverse girth is somewhat exaggerated, i.e.
$\pi$ is about 2% in error when thus deduced. It might at first sight suggest itself
that the transverse section of the egg may not be truly circular. Suppose it an
ellipse of eccentricity $e$. Then if we agree that it is equally likely that the breadth
of the egg may be measured in any meridian we find

Mean Transverse Girth / Mean Breadth $= \pi (1 + \tfrac{1}{16} e^4)$,

if e be small. If, however, we put in the values found, i.e. $G_t/B = 3.2071$, we have


$e^4 = .3320$,


leading to $b = .6510a$ for the relation between the semi-axes of the ellipse—a
quite impossible value. It may be suggested that our chance of taking every 
breadth is not equal and that we are most likely to take the minimum breadth. 
In this case we should have 


$G_t/B = \pi (1 + \tfrac{1}{4} e^2)$,
and with our numbers $e^2 = .0882$, leading to $b = .9576a$—an improbable but not so
impossible a relation as the former. It could hardly, however, escape observation, as 
even slightly distorted eggs are easily recognised. It seems, therefore, probable that 
the exaggeration of the girth in the transverse sense is due to the difficulty of 
adjusting the tape to the true maximum transverse section—the temptation 
being to bring the reading edge of the tape into contact with itself with the scale 
facing outwards. If we suppose the celluloid scales to be 0.5 mm. thick this
would account for the deviation. Probably the longitudinal girth is exaggerated 
in like manner. 
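The two elliptical hypotheses above can be checked numerically. The sketch below is illustrative only: it assumes the small-eccentricity approximations $\pi(1 + e^4/16)$ for a breadth taken on a random meridian and $\pi(1 + e^2/4)$ for the minimum breadth, and the variable names are hypothetical. Because these series are rough when $e$ is not small, the figures it yields agree with the printed ones only approximately.

```python
import math

# Observed ratio of mean transverse girth to mean breadth (1914 series).
ratio = 3.2071
excess = ratio / math.pi - 1.0  # relative exaggeration of pi, about 2 per cent

# Hypothesis A (assumed form): breadth measured on a random meridian,
# ratio ~ pi * (1 + e^4 / 16).
e4 = 16.0 * excess
axis_ratio_random = math.sqrt(1.0 - math.sqrt(e4))  # b/a = sqrt(1 - e^2)

# Hypothesis B (assumed form): minimum breadth always measured,
# ratio ~ pi * (1 + e^2 / 4).
e2 = 4.0 * excess
axis_ratio_minimum = math.sqrt(1.0 - e2)

print(f"excess over pi       : {100 * excess:.2f} per cent")
print(f"random meridian  b/a : {axis_ratio_random:.3f}")
print(f"minimum breadth  b/a : {axis_ratio_minimum:.3f}")
```

Either hypothesis requires a visibly flattened cross-section, which is why the text prefers the explanation by tape thickness.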

Unfortunately it did not apparently seem possible for the fieldworkers to adopt 
a more elaborate system of classification for the mottling than was used in 1913 
and accordingly no further light is obtainable with regard to the difficulties 
suggested on p. 146 of the first paper. 


The question of possible pressure on the surface of the egg as it passes through 
the oviduct influencing the amount of pigment deposited was again investigated 
by considering the broader egg in each pair from the same clutch (see l.c. p. 146).


The broader egg in every possible clutch pair has:


                          Mottling       Density of ground colour
Greater / more dense      189 cases      223 cases
The same                  …              …
Less / less dense         …              …


Thus our 735 pairs confirm the previous result (on about 100 pairs) as far as the 
mottling is concerned, but not the density of ground colour. There is no dis- 
tinction in ground colour on the average between eggs of different breadths from 
the same hen, but the broader egg does appear to have less marked mottling. 
We shall consider later whether this result for eggs of the same clutch holds for 
the general population. 


(2) Change of Type of Egg with Season. 


We have: 
TABLE I.

Character                     | Mean, Season 1913 | Mean, Season 1914
Length $L$                    | 4.14 ± .007       | 4.21 ± .004
Breadth $B$                   | 2.98 ± .004       | 3.01 ± .002
Longitudinal Girth $G_l$      | 11.39 ± .015      | 11.56 ± .007
Transverse Girth $G_t$        | 9.59 ± .014       | 9.66 ± .006
Index 100 $B/L$               | 72.04 ± .136      | 71.75 ± .070
Index of Ovality $O$          | 56.35 ± .171      | 55.81 ± .088


It is clear from this table that the eggs of 1914 were significantly larger than 
those of 1913. As the fieldworkers remarked before the eggs were tabled and 
reduced, 1914 was a splendid contrast to 1913; never were so many birds seen
and the young were as abundant as the eggs. At first sight it seemed strange to 
find such a flourishing colony after the comparative failure of the previous year, 
but in the summer of 1914 the channel was phosphorescent at night with Plankton, 
and probably as a result of this the channel was also swarming with myriads of 
“ Whitebait,” which in their turn attracted the Terns. The suggestion is thus 
thrown out that a plentiful food supply increases the size of the eggs. It must, 
however, be borne in mind that possibly only the stronger and bigger birds survived 
the previous bad season. There may have been fewer very young or very old birds 
and thus the eggs larger. 
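The claim of significance can be illustrated by expressing each difference between the two seasons in units of its own probable error. This is a minimal sketch, not the authors' computation: it assumes the two seasons are independent, so that the probable error of a difference is the root of the sum of squares, and the dictionary and function names are hypothetical. The figures are those of Table I.

```python
import math

# (mean 1913, probable error 1913, mean 1914, probable error 1914), from Table I.
table_1 = {
    "Length L": (4.14, 0.007, 4.21, 0.004),
    "Breadth B": (2.98, 0.004, 3.01, 0.002),
    "Longitudinal Girth": (11.39, 0.015, 11.56, 0.007),
    "Transverse Girth": (9.59, 0.014, 9.66, 0.006),
    "Index 100 B/L": (72.04, 0.136, 71.75, 0.070),
    "Index of Ovality": (56.35, 0.171, 55.81, 0.088),
}


def difference_in_probable_errors(m1, pe1, m2, pe2):
    """Return (m2 - m1) divided by the probable error of the difference."""
    pe_difference = math.hypot(pe1, pe2)  # assumes independent samples
    return (m2 - m1) / pe_difference


ratios = {
    name: difference_in_probable_errors(*row) for name, row in table_1.items()
}

for name, value in ratios.items():
    print(f"{name:20s} {value:+6.1f} probable errors")
```

On this reckoning the four absolute measurements differ by several times their probable errors, while the two shape indices differ by much less.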


We may now consider the variabilities of the two years.


TABLE II.


                            | Standard Deviation          | Coefficient of Variation
Character                   | 1913         | 1914         | 1913           | 1914
Length $L$                  | .180 ± .035  | .185 ± .003  | 4.34 ± .12     | 4.39 ± .006
Breadth $B$                 | .099 ± .010  | .099 ± .001  | 3.33 ± .09     | 3.28 ± .005
Longitudinal Girth $G_l$    | .376 ± .010  | .350 ± .005  | 3.30 ± .09     | 3.08 ± .005
Transverse Girth $G_t$      | .347 ± .010  | .300 ± .004  | 3.62 ± .10     | 3.10 ± .005
Index 100 $B/L$             | 3.449 ± .096 | 3.479 ± .050 | [4.79 ± .13]*  | [4.84 ± .069]*
Index of Ovality $O$        | 4.334 ± .121 | 4.326 ± .062 | 7.69 ± .22     | 7.75 ± .111


The table indicates that the material for 1914 is slightly less variable than
that of 1913 taken as a whole. This is possibly due as we have suggested to the
bad season of 1913 reducing the number of very young or very old birds and so the 
small eggs in 1914. But most of the differences are insignificant except those in the 
two girths. We anticipate that a good deal of interest from the evolutionary stand- 
point might be reached by secular observations on the eggs of this tern colony, 
taken in conjunction with records of the food supply and climate both in the 
nesting season and after. It would be of interest also to mark certain birds and 
record if possible their return. 


(3) Associations of Nest and Egg Pattern.


It is of great interest to discover whether there is any protective action in the 
colouring and mottling of the egg. In an egg which varies in itself so largely as
the tern’s this question must be considered not so much in regard to the general 
nesting habits of the species, but in regard to the nest and environment of each 
individual bird. The occasional and possibly habitual practice (see our ftn. 
p. 308) of laying and nest building simultaneously may indeed suggest that the
birds adapt the immediate environment and material of the nest to the actual 


* See remarks, footnote †, p. 147 of previous paper.
character of their eggs. If the egg in shape, colour-value and mottling be related
to the individual nest, it is hardly conceivable that a hen, especially when a young
bird, can a priori appreciate what the type of her egg is likely to be and prepare
the corresponding protective nest accordingly. Such an instinct would be con- 
ceivable in the case of a species with more uniform eggs and building a specific 
type of nest; it is hard to conceive it possible in the case of such a wide colouring 
and mottling range as we find in the common tern. The alternative is to suppose 
a considerable variety of tern gentes, who like the suggested cuckoo gentes select 
a particular environment for their eggs. Such a suggestion is not without 
difficulty; it involves mating within the gens, or a transmission of the egg colour- 
ing mechanism through the female only. To accept the latter is not consonant 
with our experience that sexual characters of the female are transmitted through the 
male, i.e. the fertility of the mare and the character of a cow’s milk are correlated
with the like characteristics in their paternal grandmothers. It is conceivable that 
the pigmentation may vary to some extent with the immediate food supply. In 
this case green and brown eggs of the same shape and size within the same 
clutch might be more readily accounted for than by the hypothesis of two hens of 
different gentes using the same nest*. It might also admit of the hen having 
some inkling of the character of her forthcoming eggs, if the nest be made before- 
hand. Besides this it would free us from any hypothesis as to tern gentes. 
Thus far we have written as if the protective colouring of eggs was a demon- 
strated phenomenon. It is highly probable in the case of many species building 
specific nests in specific environments. Can it be asserted of the common tern ? 
If not, elaborate and most varied colouring and mottling would appear to be 
physiological, and originate before they attain protective character. In other words 
egg patterns have been specially selected for protective purposes, but did not 
originate in the survival of the better protected. 


It will be remembered that we have divided our nests into the unelaborated 
nests, i.e. nests with no material, and with no hole, or merely a hole in the ground, 
and elaborated nests or nests formed by a hole and with accumulated material. 
We shall denote these by S and C, i.e. simple and complex†. We will consider
first absolute size as measured by the longitudinal girth, $G_l$.


The following table gives the data. The mean of the S-nest eggs is 11.556 as
against 11.373 for the total population. The correlation found by the biserial r
method was

r = +.0685 ± .0322.


* Clutch a, figured in Plate III, shows three eggs practically identical in shape and size yet of very 
different ground colour. Since the size is quite abnormal—being the smallest found in 1914—one can 
hardly believe that three birds laid three such eggs in one and the same nest! Again in the Psamma 
nest referred to in the ftn. p. 308, the three eggs were laid on three successive days; two eggs were 
alike in colour, but the third completely different. 

† Actually of course every degree of elaboration can occur with a hole and every degree of accumu-
lation of material. Thus although we have only two categories these cover practically continuous
grades of elaboration and justify the use of biserial r method of determining the association. 
This relationship is hardly significant, and if significant only of very small in-
tensity. It would indicate that the eggs of greater longitudinal girth were on
the whole deposited in the more elaborated nests. 

To investigate the matter more closely we now correlated the length and 
breadth of the egg with the nature of the nest, obtaining Tables IV and V. 

Here the mean of the egg lengths in the simple nests is 4.177 cms. and for the
total population 4.206 cms. while the correlation is given by

r = +.0953 ± .0321.

This is probably just significant although only slightly larger than that for
the longitudinal girth.
TABLE V.
Correlation of Nest Type and Breadth of Egg.
Breadth of Egg.

[Class headings and the rows for S and C are not recoverable from this copy.]
Totals | 1 | 3 | 7 | 19 | 86 | 60 | 128 | 180 | 234 | 219 | 147 | 51

* 2.595— denotes all values from 2.595 to 2.645, i.e. all the recorded values to two decimals from
2.60 to 2.64.
Here the mean of the egg breadths for the simple nests is 3.028, while for the
total population it is 3.013. We have
r = −.0952 ± .0321,
or the broader eggs are on the whole in the less-elaborated nests. Thus far then 
the rough nests appear associated with a short broad egg, although the correlations 
are only slight. 
With a view of analysing this point further we now investigate the correlation 
of the index with the type of nest. 
TABLE VI.
Correlation of Nest Type and Egg Index B/L.

Values of Index.

[Class headings and the leading classes are not recoverable from this copy.]
S      |  3 |  8 |  14 |  31 |  40 |  33 | 10 |  4 | — | — | — | — | —
C      | 33 | 71 | 168 | 229 | 213 | 142 | 59 | 23 | 4 | — | 3 | 1 | —
Totals | 36 | 79 | 182 | 260 | 253 | 175 | 69 | 27 | 4 | 0 | 3 | 1 | 0


* 57.95— denotes all values from 57.95 to 59.95, i.e. all recorded values from 58.0 to 59.9, the indices
being recorded to one decimal place.
The mean index for the rougher nest was 72.530 and for the general popu-
lation 71.752. We find for the correlation:


r = −.1372 ± .0319,


a value greater than in the case of either length or breadth, the less elaborated 
nests having the rounder egg. 
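The value just quoted can be approximately recovered from the published summary figures. The sketch below assumes the usual biserial formula, r = (M_S − M)/σ × p/y, where p is the proportion of eggs in simple nests and y the normal ordinate at the corresponding division; the function name, the counts 143 and 1108 (taken from the nest-type totals given later in the paper), and the sign convention (positive when the character increases with elaboration) are assumptions made for illustration.

```python
from statistics import NormalDist


def biserial_r(mean_group, mean_total, sd_total, n_group, n_total):
    """Biserial r of a measured character with membership of one group.

    Assumes the dichotomy cuts an underlying normal variable.
    """
    p = n_group / n_total               # proportion in the group
    z = NormalDist().inv_cdf(p)         # normal deviate at the division
    ordinate = NormalDist().pdf(z)      # normal ordinate y at that point
    return (mean_group - mean_total) / sd_total * p / ordinate


# Index 100 B/L: mean for simple nests, mean and standard deviation for all eggs.
r_with_simple = biserial_r(72.530, 71.752, 3.479, 143, 1108)

# The paper's sign is taken here as positive towards the elaborated nest.
r_index = -r_with_simple
print(f"biserial r for index and nest type: {r_index:+.4f}")
```

The same function applied to the length and breadth means gives values close to the printed ones, small differences being expected from rounding of the published means.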


We now took $C = B^2 \times L$ as a rough measure of the volume of the egg and
found:


r = −.0223 ± .0322,


or r is sensibly zero. In other words there is no relation of volume of egg to the 
type of the nest. Since we might suppose the younger bird to lay smaller eggs, 
or at any rate less broad eggs, the solution of the simple nests being due to 
young birds finds no confirmation in our analysis; it is the shape of the egg 
rather than its size which is associated with its environment. In order to test
this further the lower portion of the axis $L - \tfrac{1}{2}B$ and the Second Index of
Ovality*, $100 (L - \tfrac{1}{2}B)/B$, were correlated with the type of nest.


They gave respectively:
r = +.1233 ± .0319,
and r = +.1492 ± .0318.


In other words the greater the extension below the hemisphere and the greater 
the ovality the more likely the nest to be elaborated. Thus we see that the rotund 
egg is more characteristic of the careless nest. It is conceivable that the rounder 
the egg the less likely it is to catch the eye when laid amid small pebbles and 
shingle. We next turn to investigate the association of colour and mottling with 
type of nest. First we inquire as to the simple relation of green and brown to 
the nest. Here we cannot go further than a fourfold table: 


* The relative advantage of $O_2 = 100 (L - \tfrac{1}{2}B)/B$ and $O_1 = 100B/(L - \tfrac{1}{2}B)$ consists solely in the ovaloid
character of the egg increasing as $O_2$ increases, while it decreases as $O_1$ increases. Either may really
be used indifferently if this be borne in mind.


TABLE VII.
Type of Nest and Colour.
Type of Nest.

Colour  |  S  |  C  | Totals
Brown   |  63 | 376 |  439
Green   |  80 | 589 |  669*
Totals  | 143 | 965 | 1108†


† One ‘slatey grey’ egg and one ‘blue’ egg had to be omitted from this table.


We find for tetrachoric r
r = +.0745 ± .0409.


This cannot in itself be considered significant. The sign indicates that green 
egg-layers make the more elaborate nests. No stress can, however, be laid on the 
result. 


We now take mottling and type of nest using the arrangement below as the 
best order we could devise of decreasing mottling. 


TABLE VIII.
Mottling and Type of Nest.
Categories of Mottling.

[Cell entries are not recoverable from this copy; the legible category headings are e, g and a+b.]
Type of Nest | Totals
S            |  143
C            |  965
Totals       | 1108


The method adopted was that of ‘biserial r’ with class index correction for the
mottling categories. We find, the class index correlation being .9534,


Correlation = +.1141 ± .0325,


the sign indicating that the finer blotches are associated with the more elaborate 
nests. 


* The preponderance of green eggs over brown in the ternery at Blakeney Point deserves con-
sideration because it has not always been recognised. H. Seebohm, Eggs of British Birds, London,
1896, writes that the eggs “vary in ground colour from pale greyish-buff to brownish-buff, occasionally
with a tinge of green” (p. 102). F. O. Morris, Natural History of the Nests and Eggs of British Birds,
London, 1892, gives a wider range of colours, “pale blue, pale yellow, green, brown, white or light dull
yellowish or stone colour” (Vol. III. p. 136), which certainly does not emphasise the broad alternative
categories brown or green, with a fractional percentage of blue or grey. 
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Lastly we turn to the intensity of the ground colour and the type of nest. 
Here we have worked independently brown and green eggs and the results are 


given in Tables IX and X. 


TABLE IX.
Type of Nest and Density of Colour in Brown Eggs.

Type of Nest      Totals
S                     63
C                    376
Totals               439

[The density-of-colour column headings and most cell frequencies are not legible in this copy. The surviving entries are 5, 12, 8 in the first row and 58, 83 in the second.]

Again we use the ‘biserial η’ method and correction for class index correlation (.9785); we find
Correlation = +.2189 ± .0481.


Thus there is significant, if only still very moderate, correlation, the relation- 
ship being between denser brown ground colour and the simpler nests, i.e. holes in 
the ground. 


TABLE X.
Type of Nest and Density of Colour in Green Eggs.

[The body of this table is not legible in this copy.]

Using the same method as before (class index correction .9860), but with one more category, as F and G could be separated as their total was more considerable, we have
Correlation = −.2366 ± .0407.

Thus the dark tones of green are on the whole more frequently associated with 
the nests to which material is brought. 


Accordingly in the case of both ground colours, although we cannot definitely assert that either brown or green egg-layers are the more elaborate nest builders, we can assert that the denser brown and lighter greens are somewhat more usual when the nest is a mere hole in the shingle, and that the lighter brown and darker green eggs are associated with more elaborately constructed nests. Again the larger blotches are in somewhat greater proportion associated with unelaborated nests and the finer mottling with the elaborate nests. There is no reason to believe in any appreciable difference in volume of the simple nest and the complex nest eggs, but the former differ somewhat in shape from the latter, being broader and shorter, i.e. the eggs in mere holes are more rotund and in the elaborate nests more ovaloid.

Although none of these characters appear to be highly correlated with the type of nest as determined by the simple alternative categories adopted by the fieldworkers, yet they are of a nature which more or less lend themselves to explanation on the basis of a protective colouring. It is not possible to determine whether the great variety of colouring and mottling in the common tern’s egg is a vestige of an elaborate system once developed for protective purposes, and now falling into disuse, or whether, as a product of physiological causes, it is now being slowly adapted to protective purposes. The problem is a very interesting one and further light we think might be thrown on it, if a fuller record were in future to be made of the immediate colouring of the nest, that is, the colour of the materials out of which it is made, and in the case of holes the colour of the ground and the shape, nature and colour of the adjacent pebbles or shingle. It would mean much additional labour, but considerable information bearing on the points discussed above might arise from such data.


(4) The Problem of the Mixed Colour Clutches. 


We propose in this section to discuss the problem of the mixed colour clutches. 
The following are the data to be analysed : 


TABLE XI.
Colour Composition of Clutches.

Size of Clutch   Number   Colour Composition                                  Total Eggs
1                138      74 B + 63 G + 1 SG                                   138
2                178      67 B² + 19 BG + 92 G²                                356
3                204      62 B³ + 8 B²G + 14 BG² + 119 G³ + 1 BL               612
4                  1      0 B⁴ + 0 B³G + 0 B²G² + 0 BG³ + 1 G⁴                   4
Totals           521      203 B only, 41 composite, 275 G only, 2 anomalous   1110

Bⁿ = n brown, Gᵐ = m green, SG = slatey grey, BL = blue eggs*.


Putting aside the two anomalous eggs, we have 41 clutches out of 519 wherein 
brown and green eggs are mixed. Putting aside the clutch with 4 green eggs we 
see that as a whole there are 


443 brown eggs to 659 green eggs, 


* The blue egg may be accounted for by the oxidisation of a green egg, a phenomenon observed by Newton (Art. ‘Birds’ Eggs,’ Encycl. Brit.); the origin of the oxidisation being unrecognised in this case. Newton also states that the individuals of some few species of birds do not always lay eggs of the same ground colour, but the source indicated by him, i.e. change with age of bird, would not apply to our case.

but the proportions vary with the size of the clutch; for we have

74 brown to 63 green eggs in the clutches of 1,
153 brown to 203 green eggs in the clutches of 2,
216 brown to 393 green eggs in the clutches of 3,

or 100 to 85, 100 to 133, 100 to 182 brown to green eggs respectively. In other
words the proportion of green to brown eggs increases with the size of the clutch. 
Those readers who will examine Plate VIII in the first memoir* will see how 
distinct the brown and green ground colours are, and will understand how necessary 
it is to find some explanation for the change in proportions of colour as the clutch 
increases in size, and for the mixture of colours in the same nest. The field- 
workers appear to be confident that the same bird can lay different coloured eggs, 
basing their statement apparently on diversity of colour appearing in clutches of 
eggs having the same size or shape. 
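
The arithmetic behind these proportions can be retraced from Table XI. The following is a minimal sketch in Python; the variable names are ours and form no part of the original memoir, and the two anomalous eggs and the single clutch of four are left out, as in the text.

```python
# Whole and mixed clutches from Table XI, as (brown eggs, green eggs, clutches).
clutches = {
    1: [(1, 0, 74), (0, 1, 63)],
    2: [(2, 0, 67), (1, 1, 19), (0, 2, 92)],
    3: [(3, 0, 62), (2, 1, 8), (1, 2, 14), (0, 3, 119)],
}

def egg_totals(rows):
    """Return (brown eggs, green eggs) summed over one clutch size."""
    brown = sum(b * n for b, g, n in rows)
    green = sum(g * n for b, g, n in rows)
    return brown, green

totals = {size: egg_totals(rows) for size, rows in clutches.items()}

# Green eggs per 100 brown eggs, for each size of clutch.
green_per_100_brown = {size: 100 * g / b for size, (b, g) in totals.items()}

for size in sorted(totals):
    b, g = totals[size]
    print(f"clutches of {size}: {b} brown, {g} green, "
          f"100 to {green_per_100_brown[size]:.0f}")
```

Summed over the three sizes the totals are 443 brown and 659 green eggs, agreeing with the figures given above; the ratio for clutches of three comes out at about 100 to 182.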


The hypotheses that suggest themselves are : 


(i) That the common terns consist of two gentes one of which lays brown and 
the other green eggs. The mixture of colour arises from the existence of ‘ cuckoo’ 
terns who lay in other hen’s nests. 


We cannot ascertain the number of brown egg-laying tern ‘cuckoos’ who lay in brown egg nests or of green egg-laying tern ‘cuckoos’ who lay in green egg nests. But if the 19 BG arise from cuckoo-terns, we must originally have had 74 + 63 + 19 single egg nests and in these 156 nests 19 tern ‘cuckoos’ of opposite colour laid. The chance therefore of a tern ‘cuckoo’ of opposite egg colour laying in the 1 egg nests is .1218. Treating the 2 egg nests in the same manner, we have 67 + 92 + 8 + 14 = 181 of them and in 22 we have occurring the egg of the tern cuckoo of opposite colour, or the chance is .1215; this number is substantially the same as we reached before and the coincidence is remarkable. But it collapses when we go a stage further. We have 62 + 119 whole colour clutches of 3; we should therefore expect 25 clutches of 4 with composite colours, i.e. 25/(62 + 119 + 25) = .122 nearly. Now only a single 4 clutch nest was found and this had all green eggs. With a chance of about 1 in 8 that a cuckoo-tern will lay in any nest, it is hard to believe that it missed at least 181 nests. It appears that three eggs is the practical limit to the size of the clutch laid by one hen, but it seems hard to believe that the cuckoo-tern would avoid all nests which already had three eggs†, i.e. the cuckoo-tern hypothesis seems to involve a considerable percentage of composite four egg clutches, which do not appear. This argument seems sufficient to render the hypothesis very improbable.
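
The chances quoted in this argument follow directly from the clutch counts of Table XI. A short sketch in Python is given below to make the steps explicit; the variable names are hypothetical and the code only restates the reasoning of the paragraph above.

```python
# Whole-colour and mixed clutch counts taken from Table XI.
whole_1 = 74 + 63          # single-egg nests, brown or green
mixed_2 = 19               # two-egg clutches BG
whole_2 = 67 + 92          # two-egg clutches, both brown or both green
mixed_3 = 8 + 14           # three-egg clutches of mixed colour
whole_3 = 62 + 119         # three-egg clutches, all brown or all green

# On the 'cuckoo' view every mixed clutch began as a whole-colour nest
# of one egg fewer, to which a hen of the other gens added an egg.
chance_1 = mixed_2 / (whole_1 + mixed_2)      # 19 / 156
chance_2 = mixed_3 / (whole_2 + mixed_3)      # 22 / 181

# Number x of mixed four-egg clutches to be expected if
# x / (whole_3 + x) is to equal the same chance.
expected_mixed_4 = chance_2 * whole_3 / (1 - chance_2)

print(round(chance_1, 4), round(chance_2, 4), round(expected_mixed_4, 1))
```

About 25 composite clutches of four would thus be looked for, against none observed.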


(ii) There is only one gens of the common tern which can lay both brown
and green eggs. Since, however, the number of green eggs increases with the 
size of the clutch, it is not possible to consider the chance of laying a brown, 


* Biometrika, Vol. x. p. 146.
† Or that the rightful owner having laid two eggs would refrain from laying the third because the ‘cuckoo’ tern had already laid it.

respectively a green egg as the same in successive layings. The change of pigment in successive layings may be a physiological exhaustive process as a change from a melanin to a lipochrome. This hypothesis does not assume that any given bird may or may not lay a green or brown egg according to a given law of chance, but that physiologically there is a tendency with successive laying to alter the nature of the pigment in the glands or on the surface of the oviduct. For example the hen, as the incubation period approaches, may change the quantity or character of her food.


It is probable, however, that the changes will not be the same for small and large layers; we shall therefore give generality to the problem by supposing the probability of laying a brown egg to vary not only with the number of eggs laid but with each egg.

We have then the following system of notation: $p_s'$, $p_s''$, $p_s'''$, ... = chance of laying a brown egg in the 1st, 2nd, 3rd, ... laying of a hen who lays a clutch of $s$ eggs. The corresponding chances of laying green eggs will be $q_s' = 1 - p_s'$, $q_s'' = 1 - p_s''$, $q_s''' = 1 - p_s'''$, .... Let there be $N_s$ $s$-clutch common tern hens.

Then our data are to be provided by the equations:

$$N_1 p_1' + N_1 q_1' = 74 + 63 \quad (N_1 = 137),$$
$$N_2\, p_2' p_2'' + N_2 (p_2' q_2'' + q_2' p_2'') + N_2\, q_2' q_2'' = 67 + 19 + 92 \quad (N_2 = 178),$$
$$N_3\, p_3' p_3'' p_3''' + N_3 (p_3'' p_3''' q_3' + p_3' p_3''' q_3'' + p_3' p_3'' q_3''') + N_3 (p_3' q_3'' q_3''' + p_3'' q_3' q_3''' + p_3''' q_3' q_3'') + N_3\, q_3' q_3'' q_3''' = 62 + 8 + 14 + 119 \quad (N_3 = 203).$$


Dividing out by the totals in each case and equating corresponding terms we have the following system of equations to solve:

$$p_1' = .540{,}1460, \quad q_1' = .459{,}8540 \quad \text{(i)},$$
$$p_2' p_2'' = .376{,}4045, \quad p_2' q_2'' + p_2'' q_2' = .106{,}7416, \quad q_2' q_2'' = .516{,}8539 \quad \text{(ii)},$$
$$p_3' p_3'' p_3''' = .305{,}4187, \quad p_3'' p_3''' q_3' + p_3' p_3''' q_3'' + p_3' p_3'' q_3''' = .039{,}4089,$$
$$p_3' q_3'' q_3''' + p_3'' q_3' q_3''' + p_3''' q_3' q_3'' = .068{,}9655, \quad q_3' q_3'' q_3''' = .586{,}2069 \quad \text{(iii)}.$$
(i) is solved as it stands. But it is clearly impossible to take $q_2' = q_1'$, for this would involve $q_2''$ being greater than unity, an impossible value. Similarly $q_3'$ and $q_3''$ cannot be equal to $q_2'$ and $q_2''$ respectively, or we should have $q_3''' > 1$. Thus it is needful that the probability of laying a green egg should increase with successive eggs or be a function of the fertility. Assuming this change of probability, we may write the first two equations of (ii), the third not being independent:
$$p_2' p_2'' = .376{,}4045, \quad p_2' (1 - p_2'') + p_2'' (1 - p_2') = .106{,}7416,$$
which gives us $p_2' + p_2'' = .859{,}5506$, or $p_2'$, $p_2''$ are roots of the quadratic
$$p_2^2 - .859{,}5506\, p_2 + .376{,}4045 = 0.$$

These roots are imaginary.

Turning now to (iii) we find from the first three equations:
$$p_3' p_3'' p_3''' = .305{,}4187,$$
$$p_3' p_3'' + p_3'' p_3''' + p_3''' p_3' = .955{,}6650,$$
$$p_3' + p_3'' + p_3''' = 1.064{,}0394.$$
These lead to the cubic for $p_3$,
$$p_3^3 - 1.064{,}0394\, p_3^2 + .955{,}6650\, p_3 - .305{,}4187 = 0.$$

One root of this cubic is $p_3 = .449{,}5251$, which gives on dividing the factor $p_3 - .449{,}5251$ out
$$p_3^2 - .614{,}5143\, p_3 + .679{,}4254 = 0.$$

The roots of this quadratic are both imaginary.

Accordingly neither the records for the nests with two eggs nor those for the 
nests with three eggs are consistent with a single gens the hens of which lay 
brown eggs with a tendency to lay green increasing with greater fertility. This 
hypothesis has therefore to be discarded. 
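
The rejection of this hypothesis rests on two discriminants being negative, and the check is easily repeated. The sketch below is ours, not the authors’: it forms the sum and product of the chances for two-egg clutches, and the three symmetric functions for three-egg clutches, directly from the clutch counts of Table XI, finds the one real root of the cubic by bisection, and examines the quadratic that remains.

```python
# Observed clutch proportions from Table XI.
N2, N3 = 178, 203
bb, bg = 67 / N2, 19 / N2                      # both brown, one of each
bbb, bbg, bgg = 62 / N3, 8 / N3, 14 / N3       # 3 brown, 2 brown, 1 brown

# Two-egg clutches: product and sum of the two chances of a brown egg.
prod2 = bb
sum2 = bg + 2 * bb
disc2 = sum2 ** 2 - 4 * prod2

# Three-egg clutches: symmetric functions of the three chances.
s1 = 3 * bbb + 2 * bbg + bgg      # sum of the chances
s2 = 3 * bbb + bbg                # sum of their products in pairs
s3 = bbb                          # product of all three

def cubic(p):
    return p ** 3 - s1 * p ** 2 + s2 * p - s3

# The cubic is negative at 0 and positive at 1, so bisect for the real root.
lo, hi = 0.0, 1.0
for _ in range(60):
    mid = (lo + hi) / 2
    if cubic(mid) < 0:
        lo = mid
    else:
        hi = mid
root = (lo + hi) / 2

# Dividing out (p - root) leaves p^2 + b p + c.
b = root - s1
c = s3 / root
disc3 = b ** 2 - 4 * c

print(round(sum2, 7), disc2 < 0, round(root, 7), round(c, 7), disc3 < 0)
```

Both discriminants are negative, so neither the two-egg nor the three-egg records admit real values of the chances.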

(iii) As a last hypothesis we will assume that there are two gentes or types of females, one of which lays brown eggs ($p_1$) with a small chance of laying green ($q_1 = 1 - p_1$), and the other of which lays green eggs ($p_2$) with a slight chance of laying brown ($q_2 = 1 - p_2$). Let $N_s \nu_s$, $N_s (1 - \nu_s)$ be the number of brown and green laying hens in the group $N_s$ which lays $s$ eggs in the clutch. We suppose $p_1$ and $p_2$ to be independent of the fertility of the hen, until this assumption is shown to be inadequate.

Clutches of 1 egg.

$$N_1 \nu_1 p_1 + N_1 (1 - \nu_1) q_2 = \text{number of brown eggs} = N_1 \epsilon_1', \text{ say},$$
$$N_1 \nu_1 q_1 + N_1 (1 - \nu_1) p_2 = \text{number of green eggs} = N_1 \epsilon_1'', \text{ say}.$$

For our special case:
$$\nu_1 p_1 + (1 - \nu_1) q_2 = .540{,}1460,$$
$$\nu_1 q_1 + (1 - \nu_1) p_2 = .459{,}8540.$$
These equations are not, however, independent and only suffice to determine $\nu_1$ from
$$\nu_1 = (\epsilon_1' - q_2)/(p_1 - q_2) \quad \text{(iv)},$$
or the proportions of brown and green egg layers in clutches of one, when $p_1$ and $q_2$ have been found.
Clutches of 2 eggs.
If the distribution of clutches be $N_2 (\epsilon_2' + \epsilon_2'' + \epsilon_2''')$ we have:
$$\nu_2 p_1^2 + (1 - \nu_2) q_2^2 = \epsilon_2',$$
$$\nu_2 p_1 q_1 + (1 - \nu_2) q_2 p_2 = \tfrac{1}{2} \epsilon_2'',$$
$$\nu_2 q_1^2 + (1 - \nu_2) p_2^2 = \epsilon_2'''.$$
Only two of these equations are independent and it is convenient to write them in the form:
$$\nu_2 p_1 + (1 - \nu_2) q_2 = \epsilon_2' + \tfrac{1}{2} \epsilon_2'', \qquad \nu_2 p_1^2 + (1 - \nu_2) q_2^2 = \epsilon_2' \quad \text{(v)}.$$
These will give, if $p_1$ and $q_2$ are known, one equation for the determination of $\nu_2$ and one equation of condition.


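Clutches of 3 eggs.
If the distribution of clutches be $N_3 (\epsilon_3' + \epsilon_3'' + \epsilon_3''' + \epsilon_3'''')$ we have:
$$\nu_3 p_1^3 + (1 - \nu_3) q_2^3 = \epsilon_3',$$
$$\nu_3 p_1^2 q_1 + (1 - \nu_3) q_2^2 p_2 = \tfrac{1}{3} \epsilon_3'',$$
$$\nu_3 p_1 q_1^2 + (1 - \nu_3) q_2 p_2^2 = \tfrac{1}{3} \epsilon_3''',$$
$$\nu_3 q_1^3 + (1 - \nu_3) p_2^3 = \epsilon_3''''.$$

Only three of these equations are independent and these may be written:
$$\nu_3 p_1^3 + (1 - \nu_3) q_2^3 = \epsilon_3',$$
$$\nu_3 p_1^2 + (1 - \nu_3) q_2^2 = \epsilon_3' + \tfrac{1}{3} \epsilon_3'' \quad \text{(vi)},$$
$$\nu_3 p_1 + (1 - \nu_3) q_2 = \epsilon_3' + \tfrac{2}{3} \epsilon_3'' + \tfrac{1}{3} \epsilon_3'''.$$

These suffice to determine $\nu_3$, $p_1$ and $q_2$.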


Writing the right-hand sides of (vi) as $f_3'''$, $f_2'''$, $f_1'''$ respectively we find:
$$\nu_3 = (f_1''' - q_2)/(p_1 - q_2) \quad \text{(vii)},$$
which suffices to find $\nu_3$ when $p_1$ and $q_2$ are found, and
$$p_1 = (f_2''' - f_1''' q_2)/(f_1''' - q_2), \qquad p_1^2 = (f_3''' - f_2''' q_2)/(f_1''' - q_2) \quad \text{(viii)},$$
which lead to the quadratic for $q_2$
$$\{f_2''' - (f_1''')^2\}\, q_2^2 - (f_3''' - f_1''' f_2''')\, q_2 + \{f_1''' f_3''' - (f_2''')^2\} = 0 \quad \text{(ix)}.$$
We could therefore solve (ix) and choose the appropriate root for $q_2$, find the corresponding $p_1$ from (viii), determine $\nu_3$ from (vii), $\nu_2$ from the first of (v) and $\nu_1$ from (iv). We might then use the second equation of (v) as an equation of condition. But clearly this would not be satisfactory as all our quantities are subject to considerable sampling errors. The correct method would be to determine $\nu_1$, $\nu_2$, $\nu_3$, $p_1$ and $q_2$ from the six equations (iv), (v) and (vi) so as to get the best values of these variables. But this would be a very laborious process. We propose therefore to determine $p_1$ and $q_2$ by the method of least squares from the three equations
$$p_1 = (f_2'' - f_1'' q_2)/(f_1'' - q_2)\,\text{*},$$
$$p_1 = (f_2''' - f_1''' q_2)/(f_1''' - q_2) \quad \text{(x)},$$
$$p_1^2 = (f_3''' - f_2''' q_2)/(f_1''' - q_2),$$
using to obtain linearity $q_2 = q_0 + \eta$, where $q_0$ is the value given by the quadratic (ix) and $\eta$ is supposed a small quantity with negligible square. The values of $p_1$ and $q_2$ found from (x) will be good, if not the best. Our system for $\nu_1$, $\nu_2$, $\nu_3$, $p_1$, $q_2$ will not be the best possible, but if our system is probable, that will be still more probable and the hypothesis of the two gentes of tern hens will not be contradicted by the data.

* Obtained by writing $f_1'' = \epsilon_2' + \tfrac{1}{2} \epsilon_2''$, $f_2'' = \epsilon_2'$ and eliminating $\nu_2$ between the two equations of (v).


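Our system of $\epsilon$’s is:

$$\epsilon_1' = .540{,}1460, \quad \epsilon_2' = .376{,}4045, \quad \epsilon_2'' = .106{,}7416,$$
$$\epsilon_3' = .305{,}4187, \quad \epsilon_3'' = .039{,}4089, \quad \epsilon_3''' = .068{,}9655,$$
leading to:
$$f_1'' = .429{,}7753, \quad f_2'' = .376{,}4045,$$
$$f_1''' = .354{,}6798, \quad f_2''' = .318{,}5550, \quad f_3''' = .305{,}4187.$$
(ix) now becomes:
$$.192{,}7573\, q_2^2 - .192{,}4337\, q_2 + .006{,}8485 = 0,$$

giving the small value $q_2 = .036{,}9571$ for the chance of a green gens hen laying a brown egg.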
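
The coefficients of this quadratic can be recovered from the three-egg clutch counts alone. The sketch below is a reconstruction under our own reading of the algebra: it assumes the quadratic arises by eliminating the proportion of brown layers and the chance $p_1$ from the three moment equations for clutches of three, and it takes the smaller root. The variable names are hypothetical.

```python
import math

# Proportions of three-egg clutches with 3, 2 and 1 brown eggs (Table XI).
N3 = 203
e1, e2, e3 = 62 / N3, 8 / N3, 14 / N3

# Right-hand sides of the three moment equations for clutches of three.
f3 = e1
f2 = e1 + e2 / 3
f1 = e1 + 2 * e2 / 3 + e3 / 3

# Quadratic a q^2 + b q + c = 0 for the chance q2 that a hen of the
# green-laying gens lays a brown egg (assumed form, see lead-in).
a = f2 - f1 ** 2
b = -(f3 - f1 * f2)
c = f1 * f3 - f2 ** 2

# The smaller root is the admissible one for a 'slight chance'.
q2_first = (-b - math.sqrt(b * b - 4 * a * c)) / (2 * a)

print(round(a, 7), round(b, 7), round(c, 7), round(q2_first, 7))
```

The coefficients and root agree with the printed values to the accuracy of the data, which is some reassurance that the elimination has been read correctly.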


We now return to (x) substituting the $f$’s and $.036{,}9571 + \eta$ for $q_2$. Expanding and neglecting $\eta^2$ we obtain, on extracting the root of $p_1^2$ in the third equation,
$$p_1 = .917{,}7816 + 1.242{,}3209\, \eta,$$
$$p_1 = .961{,}3638 + 1.909{,}4759\, \eta,$$
$$p_1 = .961{,}3638 + .991{,}4412\, \eta.$$

Solved by least squares these equations give for type equations:
$$p_1 = .946{,}8364 + 1.381{,}0793\, \eta,$$
$$p_1 = .948{,}2960 + 1.489{,}7513\, \eta,$$
leading to:
$$p_1 = .928{,}2876, \quad q_1 = .071{,}7124,$$
$$p_2 = .976{,}4736, \quad q_2 = .023{,}5264.$$
Whence from (vii), the first of (v) and (iv),
$$\nu_3 = .366{,}0119, \quad 1 - \nu_3 = .633{,}9881,$$
$$\nu_2 = .449{,}0123, \quad 1 - \nu_2 = .550{,}9877,$$
$$\nu_1 = .571{,}0011, \quad 1 - \nu_1 = .428{,}9989.$$
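
The least-squares step may be retraced as follows. This is a sketch under stated assumptions rather than the authors’ own working: the three linearised equations are taken as printed, the two normal equations are formed in the usual way for the unknowns $p_1$ and $\eta$, and the proportions of brown layers are then found from the first-moment equation for each clutch size. All names are ours.

```python
# Linearised equations p1 = a_i + b_i * eta, as printed above.
eqs = [(0.9177816, 1.2423209),
       (0.9613638, 1.9094759),
       (0.9613638, 0.9914412)]
q0 = 0.0369571             # first approximation to q2 from the quadratic

n = len(eqs)
Sa = sum(a for a, b in eqs)
Sb = sum(b for a, b in eqs)
Sab = sum(a * b for a, b in eqs)
Sbb = sum(b * b for a, b in eqs)

# Normal equations:  n*p1 - Sb*eta = Sa   and   Sb*p1 - Sbb*eta = Sab.
det = Sb * Sb - n * Sbb
p1 = (Sb * Sab - Sa * Sbb) / det
eta = (n * Sab - Sb * Sa) / det
q2 = q0 + eta

# Proportion of brown layers for each clutch size s, from
# nu_s * p1 + (1 - nu_s) * q2 = f1 (mean proportion of brown eggs).
f1 = {1: 0.5401460, 2: 0.4297753, 3: 0.3546798}
nu = {s: (f - q2) / (p1 - q2) for s, f in f1.items()}

print(round(p1, 7), round(q2, 7), {s: round(v, 4) for s, v in nu.items()})
```

The results agree with the printed values of $p_1$, $q_2$ and the three $\nu$’s to about four places of decimals.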


Thus about 7% of the eggs laid by hens of the brown-laying gens will be green, and only about 2% of the eggs laid by hens of the green-laying gens will be brown. Further the green-laying gens is far more fertile than the brown-laying gens, the proportion of brown to green layers falling from 57 to 43 in the single clutches to 37 to 63 in the triple clutches. The following is our analysis on this basis.

Single Egg Nests.
                  B        G
Observed          74       63
Theoretical       74       63
Number of brown egg layers 78.23.
Number of green egg layers 58.77.
Number of brown egg layers with brown eggs 72.62.
Number of brown egg layers with green eggs 5.61.
Number of green egg layers with green eggs 57.39.
Number of green egg layers with brown eggs 1.38.

Two Egg Nests.
                  B²       BG       G²
Observed          67       19       92
Theoretical       68.92    15.15    93.93
Number of brown egg layers 79.92.
Number of green egg layers 98.08.
Number of brown egg layers who lay both eggs brown 68.87.
Number of green egg layers who lay both eggs brown 0.05.
Number of brown egg layers who lay one brown, one green 10.64.
Number of green egg layers who lay one brown, one green 4.51.
Number of brown egg layers who lay both eggs green 0.41.
Number of green egg layers who lay both eggs green 93.52.

Three Egg Nests.
                  B³       B²G      BG²      G³
Observed          62       8        14       119
Theoretical       59.43    13.98    9.72     119.87
Number of brown egg layers 74.30.
Number of green egg layers 128.70.
Number of brown egg layers who lay 3 brown eggs 59.43.
Number of green egg layers who lay 3 brown eggs 0.00.
Number of brown egg layers who lay 2 brown and 1 green 13.77.
Number of green egg layers who lay 2 brown and 1 green 0.21.
Number of brown egg layers who lay 1 brown and 2 green 1.06.
Number of green egg layers who lay 1 brown and 2 green 8.66.
Number of brown egg layers who lay 3 green eggs 0.03.
Number of green egg layers who lay 3 green eggs 119.84.
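
The theoretical frequencies of this analysis are those of a mixture of two binomials, one for each gens. The following sketch shows how they may be reproduced from the fitted constants; the function name and layout are ours, and the constants are those printed above.

```python
from math import comb

# Fitted constants from the text.
p1, q2 = 0.9282876, 0.0235264       # chance of a brown egg, brown and green gens
q1, p2 = 1 - p1, 1 - q2             # chance of a green egg, brown and green gens
nu = {1: 0.5710011, 2: 0.4490123, 3: 0.3660119}   # share of brown layers
N = {1: 137, 2: 178, 3: 203}                       # clutches of each size

def theoretical(size):
    """Expected clutches with size, size-1, ..., 0 brown eggs."""
    brown_hens = N[size] * nu[size]
    green_hens = N[size] * (1 - nu[size])
    out = []
    for k in range(size, -1, -1):              # k = number of brown eggs
        ways = comb(size, k)
        from_brown = brown_hens * ways * p1 ** k * q1 ** (size - k)
        from_green = green_hens * ways * q2 ** k * p2 ** (size - k)
        out.append(from_brown + from_green)
    return out

for size in (1, 2, 3):
    print(size, [round(x, 2) for x in theoretical(size)])
```

The printed theoretical rows are recovered to within a unit or two in the second decimal place, the small differences arising from rounding.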


It will be noted that the theory gives for the B²G and BG² about inverted proportions. It also falls short in the BG group. These would very probably have been bettered with a more general solution of our six equations. But are the existing frequencies inconsistent with our observations and beyond the limits of random sampling? Summing up our results we have:

              B     G     B²      BG      G²      B³      B²G     BG²    G³
Observed      74    63    67      19      92      62      8       14     119
Calculated    74    63    68.92   15.15   93.93   59.43   13.98   9.72   119.87

From these we find χ² = 5.631, giving P = .688, or in 69 trials out of 100 the sample would be more discordant from the calculated than the actual observations. There is accordingly nothing to be said against the theory on the ground of its statistical improbability.
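
The test of goodness of fit can be repeated from the two rows of the summary. In the sketch below the probability is computed on the assumption of eight degrees of freedom (nine groups less one), which is our inference from the printed value of P and is not stated in the text; for an even number of degrees of freedom the tail of the χ² distribution has a closed form, so no library beyond the standard one is needed.

```python
from math import exp, factorial

observed = [74, 63, 67, 19, 92, 62, 8, 14, 119]
calculated = [74, 63, 68.92, 15.15, 93.93, 59.43, 13.98, 9.72, 119.87]

# Pearson's chi-square: sum of (observed - calculated)^2 / calculated.
chi_sq = sum((o - c) ** 2 / c for o, c in zip(observed, calculated))

def chi_sq_tail(x, dof):
    """P(chi-square > x) for an even number of degrees of freedom."""
    if dof % 2 or dof <= 0:
        raise ValueError("closed form holds only for even, positive dof")
    half = x / 2
    return exp(-half) * sum(half ** k / factorial(k) for k in range(dof // 2))

# Assumption: 8 degrees of freedom (nine groups less one).
P = chi_sq_tail(chi_sq, 8)

print(round(chi_sq, 3), round(P, 3))
```

By far the greater part of χ² comes from the B²G and BG² groups, the two noted in the text as standing in about inverted proportions.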

Again of the two hypotheses involved, (i) the greater fertility of the green egg layers, (ii) the fixed small probability that a hen of one gens will lay occasionally an egg of the colour of the other gens, the first seems not unreasonable; the second gives merely a quantitative measure of the assumption made by a number of ornithologists that birds can lay eggs of two colours. It assumes, however, that as a rule they do not. Clearly we need to know more of the mechanism of egg coloration before we can settle how it happens that a bird usually staining its egg brown will stain it green on a few occasions. If it be a result of type of
food, we have to assume that our two gentes feed as a rule differently, which is 
not easily to be admitted. Will this feeding habit then be hereditary and if so 
are the male birds also divided into two gentes and is the mating assortative ? 
Granted on the other hand that it is not due to food, but to differences of pigmen- 
tation mechanism, we are compelled to ask whether this mechanism is inherited 
only through the female. If not, then are the matings within the gens, or what 
is the pigmentation mechanism of heterozygote hens? If we could establish the 
existence of the two gentes each with its rule and its fixed exception to rule; if 
further the pigmentation mechanism as one must decidedly expect from the eggs 
of many species is markedly hereditary, then it is possible that in these clutches 
of composite colour lies the solvent of some difficulties which the Mendelian 
explanation meets with when the product of two protogene zygotes instead of 
being protogene is in rare cases found to be allogene. 


(5) The Organic Correlations. 

We devote this section to a consideration of the degree of relationship between 
size, shape and colour characters of the same egg, and their relative values in the 
seasons 1913 and 1914. 


(i) Mottling and Breadth, Length and Index of Egg. 


The value of the correlation of mottling and breadth in the 1913 census was .1803, but unfortunately the sign of it was possibly wrongly given, as may be seen from the Table p. 150 of the former paper (Biometrika, Vol. x.). We have taken occasion already to refer to the difficulties in the mottling scale used, but after much consideration we are unable to modify substantially the assumed order of mottling of the previous paper. In broad lines we have:

TABLE XII.

                                       Mean Breadth
Mottling                      1913 census       1914 census
                              (291 eggs)        (1108 eggs)
Confluent blotches, d+e+g        2.97              3.03
Transition Forms, a+b            2.97              2.95
Discrete, Copious, c             2.99              3.02
Discrete, Sparse, h+f+i          2.96              3.01


The value of polyserial η corrected for class index correlation of mottling is .1753 for the census of 1914. It is therefore certainly within the probable error of the difference. Now in both cases the confluent mottling gives a greater breadth than the discrete and sparse mottling, but the transition forms a + b, and c, are anomalous. The correlation ratio η in both cases is significant and shows a relation, not very intense, between mottling and breadth, but in the present stage of the mottling classification it is certainly not possible to unravel the relationship. The 1914 returns undoubtedly seem to indicate that not only the confluently but the discretely mottled eggs have the greater breadths, the lesser breadths being found in the transition forms. It should be noted that the returns for 1914 being nearly four times as numerous are worth twice as much.


If we could really lay any stress on the sign to be given to the association, we 
should have to assert that in the species at large the rule is opposite to that for 
the individual hen. In her case the broader egg has less mottling, while in the 
species the broader egg has the greater or at least the more confluent mottling. 
The former relation overrides any result to be obtained from the species as a 
whole, and seems to oppose any theory that greater pressure during transition 
through the oviduct is the source of greater mottling*. 


We have further worked out the association† of Index and Length to the Mottling. We have

                          Census 1913     Census 1914
Mottling and Breadth         .1803           .1753
Mottling and Length            -             .0937
Mottling and Index           .1550           .1598

For the 1914 census η₀ = .0850 ± .0203‡.

Since the probable error is of the order .02 we see that the value is insignificant for length. On the other hand the order of mottling classes in the three cases does not appear interpretable. The following table gives the means for each class of mottling as specified in Plate VIII of the first memoir.

* The time of transition through the oviduct may conceivably be a factor of greater importance.
† Obtained from polychoric η with correction for number of arrays and the class index correction for mottling.
‡ η₀ is the mean value of the correlation ratio on the assumption of no association.

TABLE XIII.
Mottling and Size and Shape of Egg.

Class of Mottling     Index     Breadth    Length
b                     69.52     2.954      4.256
[?]                   70.60     2.998      4.255
[?]                   70.68     2.947      4.170
[?]                   [?]       3.025      4.228
d                     71.86     3.016      4.200
f                     71.90     3.001      4.184
h                     72.12     3.042      4.220
g                     72.65     3.032      4.184
[?]                   [?]       3.017      4.133

[Entries marked [?] are not legible in this copy.]


The series for index in ascending order corresponds roughly to a series in ascending order for breadth and descending order for length, but the system does not correspond to any easily appreciated mottling order. It appears as if the fieldworkers might have been influenced by shape of egg, instead of merely comparing the nature of the mottling in selecting type. At any rate in this section no final conclusions can be drawn, and it seems very desirable that more elaborate descriptions of mottling should in future be carried out.

(ii) Ground Colour and Breadth, Length and Index of Egg.


The following scheme gives our results. The first value, η′, is the uncorrected correlation ratio; the second, η, is the value when corrected for number of arrays and class index correlation, which is .9785 for brown and .9860 for green eggs.

TABLE XIV.

              Index                 Breadth                  Length
          Brown     Green      Brown         Green      Brown     Green
η′        .1747     .1733      .1348         .1385      .2061     .1432
η         .1011     .1313      imaginary†    .0773      .1530     .0857
η₀*       .1432     .1160      .1432         .1160      .1432     .1160
         ±.0322    ±.0261     ±.0322        ±.0261     ±.0322    ±.0261


* Mean value of η supposing no association.
† This signifies that if η₀² be taken from η′² the difference is negative, i.e. η′ is less than the mean value of η for zero association.

It will be fairly obvious from this table that there is no association of ground colour, whether green or brown, with either size or shape of egg*. This does not appear at all unreasonable if we assume the ground colour to be deposited before the egg enters the oviduct or the shell becomes finally hardened.

The general conclusion therefore to be drawn from the present investigation is that intensity of ground colour, whether green or brown, has no relation to egg size and shape, but that breadth of egg, whether considered directly or through the index, is more probably related though not intensely to mottling, but the nature of the relationship must be obscure until a more elaborated classification of mottling has been adopted.


(iii) Relation of Mottling to Ground Colour.

The data are given in Table G* at the end of this memoir, where we have separated brown from green eggs, because it is conceivable that the relationships, if any, for the two categories might be different. If C₀ denote the mean contingency when there is zero association we have:

For the Brown Eggs: C₀ = .2830 ± .0323.
Uncorrected Contingency: C₂ = .2030.
For the Green Eggs: C₀ = .2118 ± .0261.
Uncorrected Contingency: C₂ = .2557.

Thus for the brown eggs there is no significance in C₂, it being less than the mean value of the contingency, when there is no association. For the green eggs C₂ is greater than C₀ but the difference is less than twice the probable error of C₀; we cannot therefore assert any real relation to exist between mottling and intensity of ground colour†. Under the circumstances of the above relation of C₂ to C₀, it did not seem necessary to correct C₂, as such correction would not alter the conclusion of no significant association. Although the intensity of ground colour may have no relation to mottling, it is conceivable that the colour of the egg may itself have relation to mottling or indeed to intensity of ground colour, i.e. a brown egg may have deeper tones of ground colour and denser mottling than a green egg.


We have the following biserial tables to illustrate these cases. 


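TABLE XV.
Mottling and Colour of Egg.

Mottling categories, in order: g+d, e, a+b, c+h, f, i.

Colour of Egg      Totals
Brown                 437
Green                 669
Totals               1106

[Only part of the body of the table is legible in this copy. The surviving entries are 215, 57, 11 for brown eggs and 300, 98, 23 for green eggs, with column totals 515 and 155; their columns cannot be assigned with certainty.]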


* This statement is not really contradicted by the η = .1506 of p. 148 of the previous memoir, for with the small number 291 eggs of that census η₀ = .1655 ± .0395, so that η is less than the value for zero association.

† Examined in the same manner the result for 1913 appears not to have the significance we attributed to it. We have C₀ = .2813 ± .0395, while the corrected contingency is only C₂ = .2260. Thus C₂ is actually less than the mean value when there is no contingency.

The order of mottling categories seems to correspond as closely as we can determine from the plate to the order of relative amount of mottling.

We find for the mean η² when there is no association:
$$\eta_0^2 = .004{,}521.$$
Hence η² corrected for number of arrays but not for class-index is given by
$$\eta^2 = \frac{.004{,}702 - .004{,}521}{.996{,}383} = .000{,}1817,$$
or η = .0135.
This is insignificant and therefore we need not trouble to find the class-index correction. It would not appear therefore that the brown eggs are more densely mottled than the green eggs.


We now pass to intensity of ground colour. It will be remembered that two scales were formed of ‘values’ giving as far as possible equal values by the same letters for both green and brown colours.

TABLE XVI.
Colour and Value.

                                Ground Colour Values
Colour of Egg     A      B      C      D      E      F+G    H+I+K    Totals
Brown             52     76     63     95     51     44     56       437
Green             34     51     71     133    85     154    141      669
Totals            86     127    134    228    136    198    197      1106


It is clear on the face of this table that the percentage of high values in the brown series is far greater than in the green series, which has a much greater percentage of low-colour values. To get an appreciation of this association we use biserial η. We have for zero association
$$\eta_0^2 = .005{,}425,$$
while uncorrected η′² = .113,734.
Accordingly corrected for number of arrays
$$\eta^2 = \frac{.113{,}734 - .005{,}425}{.995{,}479} = .108{,}8009,$$
leading to η = .3299.
Calculating the class-index correlation, we find it .9674 and thus finally corrected

η = .3410 ± .0197.
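
The corrections just described reduce to a few lines of arithmetic. The sketch below retraces them in Python, taking the numerator, the divisor .995,479 and the class-index correlation exactly as printed; no general formula for the divisor is assumed, and the final step supposes that the class-index correction consists in dividing by the class-index correlation, which reproduces the printed result.

```python
import math

eta_sq_uncorrected = 0.113734   # uncorrected squared correlation ratio
eta_sq_zero = 0.005425          # its mean value for zero association
divisor = 0.995479              # divisor as printed in the text
class_index = 0.9674            # class-index correlation

# Correction for number of arrays.
eta_sq = (eta_sq_uncorrected - eta_sq_zero) / divisor
eta_arrays = math.sqrt(eta_sq)

# Correction for class index (assumed to be a simple division).
eta_final = eta_arrays / class_index

print(round(eta_sq, 7), round(eta_arrays, 4), round(eta_final, 4))
```

With a probable error of .0197 the corrected value is some seventeen times its probable error.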


This is a significant and fairly substantial correlation between colour and colour value.

It would appear as if absence of Sorby’s oorhodeine pigment* also involved less copious pigment material in general.

(iv) Organic Relations in Shape and Size.

The fundamental tables are Tables G to L at the end of the paper. The correlations are as follows:

TABLE XVII.
Organic Correlations in two Seasons.

Character Pair                            Symbols      Correlation, 1914     Correlation, 1913
                                                       (c. 1110)             (c. 294)
Length and Breadth                        L, B          .2104 ± .0193         .2220 ± .0374
Longitudinal and Equatorial Girths        G_l, G_e      .5139 ± .0149         .5297 ± .0284
Length and Longitudinal Girth             L, G_l        .8515 ± .0055         .8804 ± .0088
Breadth and Longitudinal Girth            B, G_l        .4840 ± .0155         .5216 ± .0286
Index and Length                          I, L         −.7577 ± .0086        −.7284 ± .0185
Index and Breadth                         I, B          .4537 ± .0161         .5033 ± .0294
Index and Longitudinal Girth              I, G_l       −.4496 ± .0161        −.3832 ± .0336


The following table contains the seasonal difference and its probable error. 


TABLE XVIII.
Seasonal Change in Correlation.

Character Pair      Δ = 1914 − 1913      Probable error of Δ
L and B                −.0116               ± .0421
G_l and G_e            −.0158               ± .0321
L and G_l              −.0289               ± .0104
B and G_l              −.0376               ± .0325
I and L                −.0293               ± .0204
I and B                −.0496               ± .0335
I and G_l              −.0664               ± .0373

With the exception of the correlation of Length and Longitudinal Girth none 


of these differences has a significant relation to their probable errors. 


In the case 


mentioned, however, such a deviation would occur in excess 3 times in 100 trials 
and in defect 3 times in 100 trials, or as we have made 7 trials the odds against it are 
only 52 to 48. 
seasonal change in the organic correlations is to be observed between 1913 and 1914, 
As there were considerable changes in the means (see our p. 310) this result confirms 


* « On the Colouring-matters of the Shells of Birds’ Eggs,” 


p. 359. 


We cannot therefore lay much stress on it, and conclude that no 


Zoological Society’s Proceedings, 1875, 
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the general conclusion that except for very skew distributions, change in means 
does not involve change in correlation. Change in variability does usually denote 
change in correlation, but as we have indicated (p. 3811) the changes in varia- 
bility are not significant except in the girths, and this may be the source of such 
modification as we find in the correlation of Length and Longitudinal Girth. To 
test this we note that if Longitudinal Girth only be changed the regression 
coefficient of the Length on the Longitudinal Girth ought not to be changed 
within the limits of random sampling. For 1914 this coefficient of regression is
.4501 ± .0056 and for 1913 is .4215 ± .0089. Hence the difference is .0286 ± .0105.
Thus the difference in the regression coefficients is just as significant as it was in
the correlation coefficients, or is not explicable on the basis of increased variability
in the Longitudinal Girth. If it is, which we doubt, to be considered
significant it must depend on something else than a more variable Longitudinal
Girth.
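
The probable error attached to the difference of the two regression coefficients can be checked by combining the two separate probable errors in quadrature. The sketch below assumes the two seasons are independent samples; the function name is illustrative only and is not taken from the paper.

```python
import math

def difference_with_probable_error(value_a, pe_a, value_b, pe_b):
    """Difference of two independent estimates and its probable error.

    Assumes independence, so the probable errors add in quadrature.
    """
    difference = value_a - value_b
    pe_difference = math.sqrt(pe_a ** 2 + pe_b ** 2)
    return difference, pe_difference

# Regression of Length on Longitudinal Girth: 1914 value, then 1913 value.
diff, pe = difference_with_probable_error(0.4501, 0.0056, 0.4215, 0.0089)
print(f"difference = {diff:.4f}, probable error = {pe:.4f}, ratio = {diff / pe:.2f}")
```

The ratio of the difference to its probable error is what the text compares with the corresponding ratio for the correlation coefficients.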


We may consider in this place what changes have taken place in the formula 
connecting Longitudinal Girth with Length and Breadth. For 1914 we have:

$$G_l = 1.1273\,B + 1.4840\,L + 1.9180,$$

while for 1913 we had:

$$G_l = 1.2701\,B + 1.6415\,L + .8224.$$

The changes in the coefficients look more considerable than the changes that
will be found in the values for $G_l$ calculated from either formula for eggs which
are not extreme variants. At the same time the differences rather tend to
emphasise the suggestion given by the correlation of $G_l$ with $L$, that there may
have been a seasonal change in the organic relationship between these characters.
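
How little the two formulae differ for an ordinary egg can be seen by evaluating both. The sketch below uses the coefficients quoted above; the two egg sizes (a "typical" and a "large" egg) are hypothetical values chosen only for illustration and are not measurements from the paper.

```python
def girth_1914(length, breadth):
    """Longitudinal Girth predicted from the 1914 formula."""
    return 1.1273 * breadth + 1.4840 * length + 1.9180

def girth_1913(length, breadth):
    """Longitudinal Girth predicted from the 1913 formula."""
    return 1.2701 * breadth + 1.6415 * length + 0.8224

# Hypothetical eggs (same units as the paper's L and B): assumed, not observed.
typical_egg = (4.2, 3.0)
large_egg = (4.8, 3.3)

gap_typical = girth_1914(*typical_egg) - girth_1913(*typical_egg)
gap_large = girth_1914(*large_egg) - girth_1913(*large_egg)
print(f"typical egg: 1914 minus 1913 = {gap_typical:+.4f}")
print(f"large egg:   1914 minus 1913 = {gap_large:+.4f}")
```

For the assumed typical egg the two predictions nearly coincide, while the gap widens for the assumed large egg, which is the behaviour the text describes for extreme variants.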


(6) The Homotypic Correlations. 


The results for the 1914 season are of a very startling character; they demonstrate
that while the organic correlations remain nearly constant the homotypic
correlations can suffer a very considerable seasonal modification. In other
words the birds laid eggs very much more alike in 1914 than in 1913. The
reader will remember that 1913 was a bad season for the birds, many young 
perished and there were few nests. On the other hand 1914 was a good season ; 
there was plenty of food, numerous and possibly stronger birds. The eggs in the 
clutches were more alike in 1914 than in 1913. 


We proceeded to investigate in the first place whether the greater intensity of
homotyposis was due to there being a far larger proportion of three-egg clutches.
Accordingly we took only the 1st and 2nd eggs in the clutches and obtained the
homotypic correlation for Equatorial Girth. It was .7535, for 383 pairs of eggs.
When we took all possible pairs out of all the clutches we had 796 pairs, and the
correlation instead of rising, fell, but insignificantly, to .7469. The difference
between 1913 and 1914 cannot therefore be due to a far larger number of clutches
providing three pairs in the latter than in the former year.

TABLE XIX.

Direct Homotyposis in Size and Shape.

Characters              Symbols               Season 1913       Season 1914
Lengths of Eggs         L, L                  .4643 ± .0346     .6056 ± .0107
Breadths of Eggs        B, B                  .5176 ± .0326     .7327 ± .0078
Longitudinal Girths     G_l, G_l              .5076 ± .0327     .6689 ± .0093
Equatorial Girths       G_e, G_e              .4621 ± .0350     .7469 ± .0075

Index                   100 B/L, 100 B/L      .5537 ± .0308     .5327 ± .0120


It must at once be admitted that this result is of a very startling character. 
Only the homotyposis of the Index has remained without any significant change, 
i.e. the degree of likeness in shape does not exhibit a seasonal change; in all four 
cases of absolute size there are most substantial and of course significant changes 
in the homotyposis. The mean size homotyposis has risen from .4879 to .6885,
i.e. by about 40%! It is difficult to offer a demonstrable explanation of this
great change. The factor we are seeking for must be one which modifies so to 
speak the individuality of the bird between its successive egg layings. For 
example, a change in the climatic condition or in the food supply occurring in 
1913 somewhere during the egg-laying period. Such a factor, however, would 
lead us to suppose that the high values of 1914 were the normal homotypic 
values, whereas they appear to us from the comparative standpoint to be the 
abnormal. If we suppose only the stronger birds survived to the season 1914 and 
that there was a plentiful food supply, it would seem that the community as a 
whole should have exhibited less individuality in size and not more,—the weaker 
birds obtaining less food supply would not appear. There is, however, so little 
change of type and variability of the eggs in the two seasons that it is hard 
to believe that selection of the birds is the source of the change. Further if 
anything the variability of the eggs is less in 1914 than 1913, and such reduction 
of variability would tend to reduce rather than increase correlation. If we suggest 
that 1913 killed off many of the old birds and that there was a larger proportion 
of young birds in 1914, so that there was a more heterogeneous community 
in 1914, we are pulled up by the fact that the eggs were on the average very 
slightly larger in 1914, which is, perhaps, not what we should anticipate with a 
larger proportion of first layers. It would seem as if we had to take refuge in 
some very vague statement that the seasonal environment for 1914 interfered 
less with individuality than that of 1913. But this does not really help us and 
leaves us with the greater difficulty, that it suggests that ‘individuality’ is an 
indefinite quantity from the statistical side and might result under favourable 
environmental conditions in all the eggs of a clutch being perfectly alike! The 
persistency in the Index value seems in itself to point to a limitation in in- 
dividuality, and it seems wisest at present to await further material before 
speculating on the source of this marked seasonal change in size homotyposis.
One point, however, we can investigate, namely, whether pigmentation homoty- 
posis has or has not kept pace with size homotyposis. 
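
The mean size homotyposis for each season, and the proportional rise between them, follow directly from the four size rows of Table XIX. The short sketch below simply repeats that arithmetic; the dictionary keys are labels chosen for this illustration.

```python
# Direct homotypic correlations from Table XIX: (season 1913, season 1914).
size_homotyposis = {
    "Lengths": (0.4643, 0.6056),
    "Breadths": (0.5176, 0.7327),
    "Longitudinal Girths": (0.5076, 0.6689),
    "Equatorial Girths": (0.4621, 0.7469),
}

mean_1913 = sum(v[0] for v in size_homotyposis.values()) / len(size_homotyposis)
mean_1914 = sum(v[1] for v in size_homotyposis.values()) / len(size_homotyposis)
relative_rise = mean_1914 / mean_1913 - 1.0

print(f"mean 1913 = {mean_1913:.4f}, mean 1914 = {mean_1914:.4f}")
print(f"relative rise = {100 * relative_rise:.1f} per cent")
```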

With this aim in view the direct homotyposis has been worked out between 
mottling of one egg and mottling of a second in the same clutch, and between 
ground colour of one egg and ground colour of a second in the same clutch. 
Further the cross-homotyposis has been determined between the mottling of one 
egg and the ground colour of a second in the clutch. The fundamental difficulty 
here lies in the treatment of the ‘values’ of the ground colour. We cannot 
separate green eggs from brown, because of the occasional appearance of mixed 
colour clutches. Nor would it be reasonable to work with contingency on a 
20 × 20 category table. We have accordingly been compelled to pool green and
brown eggs, when they have the same ‘value’ on our colour scale. This at any
rate renders our present results comparable with those of 1913. But until we
know more of the mechanism of egg pigmentation it is impossible to assert that
equal ‘values’ in brown and green ground colours are what we should anticipate
as a result of individuality working occasionally with one and occasionally with
another pigment. The homotyposis pigmentation tables are given as Tables
R, S, and T at the end of this paper. In actually determining the contingency we
have clubbed d and e in the mottling together, and the brown and green classes
of the same value (A with A, B with B, C with C, etc.) in the value of the ground
colour, thus reaching 8 × 8, 10 × 10 and 10 × 8 contingency tables. These have
then been corrected for number of cells and for class-index correction. The
class-index correction for mottling is .9531, and for value of ground colour .9848.

We consider first the cross-homotyposis of ground colour and mottling. The 
coefficient of mean square contingency on the supposition that there is no asso- 
ciation between value of ground colour in one egg and mottling in a second 
would be a 

Cy 2 Olen. 

The corrected actual coefficient of mean square contingency is 

ee, 


which is less than the mean square contingency coefficient for no association.
Accordingly there is no cross-homotyposis between mottling and ground colour, 
and there should not be if our view be correct that the organic relationship in 
the same egg is zero (see p. 328). 


The value found for the 1913 data was

$$C = .3989 \pm .0379,$$

and was spoken of as significant. But the fact was overlooked that

$$C_0 = .3169 \pm .0451,$$

so that $C$ is less than twice the probable error greater than $C_0$, and may well not
be significant. This conclusion is confirmed by the consideration that the organic
correlation of mottling and ground colour was really insignificant in 1913, and
thus it is exceedingly improbable that the cross-homotyposis could be significant*.
Direct homotyposis provides the results of the following table : 


TABLE XX.

Direct Homotyposis in Mottling and Ground Colour Value.

Character                                        Season 1913    Season 1914
Mottling of Eggs in same Clutch                     .3500          .6267
Ground Colour Value of Eggs in same Clutch          .5709          .7480

The probable errors of the 1913 values are well below .045 and of the 1914
values well below .017. Accordingly the differences are markedly significant, or in
the nature of pigmentation the resemblance of eggs in the same clutch is much
more intense in 1914 than in 1913. Thus the results for size and shape of egg are
confirmed by those for pigmentation. We have therefore this very remarkable
fact, a fact which it seems to us may be of some consequence, namely that the
season can affect the extent to which the female bird impresses her individuality
on the external characters of the egg. It does not follow from this that seasonal
differences can affect in the like marked manner the individuality of the internal
characters of the egg. But it does raise the suggestion that it would be well
worth inquiring whether the degree of resemblance of offspring born in one
season can differ sensibly from the degree of resemblance of those born in another
season. Should such a difference be established, it would indicate that heredity, in
other words the nature of the germ plasm, could be more readily influenced by
seasonal differences than has yet been anticipated. We ourselves should be very
unwilling to admit this, but we must at the same time confess that we see no
obvious explanation of these significant changes in homotyposis. If individuality
impressed in the ovary and in the oviduct on the form and colouring of eggs can be
increased or decreased by seasonal differences, it is not a very long step to believe
that other physiological processes of this region which impress individuality on the
internal characters of the ovum can be modified by the nature of the season.


We now turn to the cross-homotyposis in size and shape of the tern’s egg: 


TABLE XXI.

Cross-Homotyposis in Size Characters.

Characters of the two Eggs               Season 1913       Season 1914
Length and Breadth                       .0922 ± .0441     .2621 ± .0157
Longitudinal and Transverse Girths       .2603 ± .0413     .4546 ± .0134
Length and Longitudinal Girth            .4229 ± .0362     .5854 ± .0111
Breadth and Longitudinal Girth           .2530 ± .0416     .4162 ± .0140

* See above our second footnote on p. 328. In 1913 we had not fully realised how high $C_0$ could
be for such short samples as a couple of hundred. Hence the source of the error.

We are thus again faced with the fact that the cross-homotyposis of the eggs
of 1914 is substantially higher than that of 1913. We still see the markedly 
emphasised individuality of the female birds. 

We have next to enquire whether, the organic relations being practically
constant, the cross-homotyposis has increased in proportion or not to the direct
homotyposis. We can test this by Pearson’s suggested relationship*, namely
Cross-Homotypic Correlation of x and y = ½ {correlation of x with x + correlation
of y with y} × {the organic correlation of x with y}. The following table gives the
calculated and observed cross-homotypic correlations for the seasons 1913 and
1914.


TABLE XXII.

Cross-Homotypic Correlations as Calculated and Observed.

                                            Season 1913              Season 1914
Character Pair of two Eggs             Calculated   Observed    Calculated   Observed
Length and Breadth                       .1090        .0922       .1408        .2621
Longitudinal and Transverse Girths       .2568        .2603       .3638        .4546
Length and Longitudinal Girth            .4278        .4229       .5426        .5854
Breadth and Longitudinal Girth           .2674        .2530       .3392        .4162


Thus while the calculated values were in excellent accordance with the ob- 
served in 1913, they are very inadequate to express the increased individuality in 
1914. In other words the cross-homotyposis appears increased even at a greater 
rate than the direct homotyposis which we have shown in itself to be markedly 
emphasised. 
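
The calculated values can be cross-checked against the other tables of this section. The sketch below assumes that the relationship is the mean of the two direct homotypic correlations multiplied by the organic correlation (the factor of one half is an assumption that the arithmetic supports). It backs out the organic correlation implied by each calculated value and compares the implied seasonal change with the differences printed in the table of seasonal change in correlation. All dictionary keys are labels invented for this illustration.

```python
# Direct homotypic correlations (Table XIX).
direct = {
    1913: {"length": 0.4643, "breadth": 0.5176, "long_girth": 0.5076, "trans_girth": 0.4621},
    1914: {"length": 0.6056, "breadth": 0.7327, "long_girth": 0.6689, "trans_girth": 0.7469},
}
# Calculated and observed cross-homotypic correlations (Table XXII).
calculated = {
    1913: {("length", "breadth"): 0.1090, ("long_girth", "trans_girth"): 0.2568,
           ("length", "long_girth"): 0.4278, ("breadth", "long_girth"): 0.2674},
    1914: {("length", "breadth"): 0.1408, ("long_girth", "trans_girth"): 0.3638,
           ("length", "long_girth"): 0.5426, ("breadth", "long_girth"): 0.3392},
}
observed = {
    1913: {("length", "breadth"): 0.0922, ("long_girth", "trans_girth"): 0.2603,
           ("length", "long_girth"): 0.4229, ("breadth", "long_girth"): 0.2530},
    1914: {("length", "breadth"): 0.2621, ("long_girth", "trans_girth"): 0.4546,
           ("length", "long_girth"): 0.5854, ("breadth", "long_girth"): 0.4162},
}
# Seasonal change (1914 minus 1913) in organic correlation, as printed.
seasonal_change = {("length", "breadth"): -0.0116, ("long_girth", "trans_girth"): -0.0158,
                   ("length", "long_girth"): -0.0289, ("breadth", "long_girth"): -0.0376}

def implied_organic(year, pair):
    """Organic correlation implied by: cross = 0.5 * (r_xx + r_yy) * r_xy."""
    x, y = pair
    return calculated[year][pair] / (0.5 * (direct[year][x] + direct[year][y]))

for pair in seasonal_change:
    change = implied_organic(1914, pair) - implied_organic(1913, pair)
    excess = observed[1914][pair] - calculated[1914][pair]
    print(pair, f"implied change {change:+.4f}", f"1914 excess of observed {excess:+.4f}")
```

Under that assumption the implied changes agree with the printed differences to within rounding, while the observed 1914 values exceed the calculated ones in every pair.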


What we are accordingly confronted with in the season 1914 is an exuberance 
of individuality and the possibilities which such a variation of individuality 
suggests. It may be confined to the externals of the egg, but the physiological 
factors which determine those externals must at least be in close proximity and 
may, perhaps, be affiliated with others which affect matters much more important. 
The approximate constancy of type, variability and organic correlation for these 
two seasons coupled with the marked change in homotyposis is a problem which 
demands further observations and much hard thinking.


* Phil. Trans. Vol. 197 A, p. 290. 
Plate VI. Common tern sitting; objects to giving a sitting to the photographer.


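LIST OF TABLES APPENDED.

Table A.  Organic Correlation.  Mottling and Breadth.
Table B.  Organic Correlation.  Mottling and Length.
Table C.  Organic Correlation.  Mottling and Index.
Table D.  Organic Correlation.  Value of Brown Ground Colour and Breadth.
Table D.  Organic Correlation.  Value of Green Ground Colour and Breadth.
Table E.  Organic Correlation.  Value of Brown Ground Colour and Length.
Table E.  Organic Correlation.  Value of Green Ground Colour and Length.
Table F.  Organic Correlation.  Value of Brown Ground Colour and Index.
Table F.  Organic Correlation.  Value of Green Ground Colour and Index.
Table G.  Organic Correlation.  Ground Colours and Mottling.
Table G.  Organic Correlation.  Length and Breadth.
Table H.  Organic Correlation.  Longitudinal and Transverse Girths.
Table I.  Organic Correlation.  Length and Longitudinal Girth.
Table J.  Organic Correlation.  Breadth and Longitudinal Girth.
Table K.  Organic Correlation.  Length and Index.
Table L.  Organic Correlation.  Breadth and Index.
Table L.  Organic Correlation.  Longitudinal Girth and Index.
Table M.  Direct Homotyposis.  Lengths.
Table N.  Direct Homotyposis.  Breadths.
Table O.  Direct Homotyposis.  Longitudinal Girths.
Table P.  Direct Homotyposis.  Transverse Girths.
Table Q.  Direct Homotyposis.  Indices.
Table R.  Direct Homotyposis.  Mottlings.
Table S.  Direct Homotyposis.  Values of Ground Colour.
Table T.  Cross-Homotyposis.  Mottling and Value of Ground Colour.
Table U.  Cross-Homotyposis.  Length and Breadth.
Table V.  Cross-Homotyposis.  Longitudinal and Transverse Girths.
Table W.  Cross-Homotyposis.  Length and Longitudinal Girth.
Table X.  Cross-Homotyposis.  Breadth and Longitudinal Girth.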


Biometrika, Vol. XII, Parts III and IV

Plate II. Nests of the Common Tern, indicating the range of material used and character of immediate environment. Photographs by W. Rowan.

Plate III. Eggs of the Common Tern, painted by W. Rowan: eggs of one clutch; eggs of a second clutch (these eggs illustrate diversity within the clutch); atypical egg, found only on one occasion. Lith. Cambridge University Press.

Plate IV. Common Tern just alighting, to indicate great length of wings.
TABLE A. Organic Correlation. Mottling and Breadth.
Mottling*.

[Table body not recoverable from the source.]

* Two eggs with no recorded mottling. † 2.60- contains all breadths from 2.595 to 2.645.


TABLE B. Organic Correlation. Mottling and Length.
Mottling*.

[Table body not recoverable from the source.]

* No mottling is recorded in the case of two eggs. † 3.25- embraces all lengths from 3.245 to 3.295.
TABLE C.
Organic Correlation. Mottling and Index.


Mottling*.


* Two eggs with no recorded mottling.
† 58- contains all eggs 57.95-59.95.


TABLE D*. 


Organic Correlation. Value of Brown Ground Colour and Breadth. 


Value of Ground Colour. 


Breadth 


2-60— | == == 
265— — | —- _— == = — 
270— 1 Ll}; — a= _ — = 
2-75 - 4 2 1 2 | 
2-80— 2 1 3 1. | '5) = 1 
2°§5— 4 5 2} 4 3 ] 2 
2-90— Ti 6 6 9 4 2; 4 
9 13 8 16 | 6 4 1 
12 15 18 27 9 4 4 
4 5 
2 1 
L ee 
TABLE D*. Organic Correlation. Value of Green Ground Colour and Breadth.


Value of Ground Colour. 


| 0; 7 Totals | 
[ 1 ace — — 1 
2: | i} = 1 a a = ee ee 2 
Up | 2) — | ~- 1 l | | 4 
2 ie 1 hah ee”? = 1 hae eo es 9 
2: i 1 1 ee ee 6 3 2 3 2 22 
2 = 2 7 5 5 4 5 4 5) 2 37 
2: 3 6 13 20 | 10 9 3 6 9 3 82 
2 5 16 7 94 | 16 12 10 4 | 9 2 105 
3° 9 10015 26 | 22 16 11 13 | 11 4 137 
3 6 i 10 S20 ein ly 17 6 13 9 138 
a: 7 We od 10 184 5 | 11 14 7 8 | 3 87 
3° ii pee 2 5 as) 6 2 4 5 5 32 
32 — ; 1 — 1 | 2 3 2 — 1 2 12 
3: = ee 1 = = = = =2 = — 1 

Totals | 34 | 51 | 71 | 133 | 85 | 85 | 69 | 47 | 62 | 32 

| 2 1 


TABLE E*. Organic Correlation. Value of Brown Ground Colour and Length. 


Value of Ground Colour. 


Length | 4: | Bi | OC, | n| Hh! A | & | H, | t | Ky | Totals | 
ee 
| g25— | - | = Eas ee a Sa epee 1 
0 = =a 0 
3:35 = = 0 
340 0 
3 45— Oo. 
3°50 — = 1 
Bp _ 0 
3°60— = 6) 
3°65— — ss 2 
3°70— = == 0 
Sb as 2 a 2 
3:30— = 1 =S il 3 
3°85. 1 i = 3 7 
3-00 2 2 1 1 14 
3°95 2 5 2 3 1 7 
400— 5 5 3 1 1 24 
405— 6 6 2 2 3 38 
10 8 18 11 3 4 64 
hs — 0 4 3 3 47 
4° 20— i 2 — ] 50 
Yeo — ie ® 1 2 39 | 
430 1 yi 3 1 47 | 
Bb 2 7 1 1 22 
4 4O— 7 1 =) 23 
Le— 3 3 1 14 
450 2 3 = 13 
4h b5— 1 — _- 4 
4°60— — oa — 3 
| 4°65 = ay 2 
4 70— _- — — 0 
45 — 1 a — 1 
480— = 0) 
4 85— 1 1 | 
Totals 439 
TABLE E”. Organic Correlation. Value of Green Ground Colour and Length.
Value of Ground Colour. 


Length | 4, | B | G@ | BD] & | | & | Bw | db | Ky | Totals 
BOE 0 
3°30 1 | 
3°3h— ] j 
3 40— 0) 
B45— 0) 
3°50 0) 
3°6h5— oe] 
2°60 0 | 
365— 1 — = = 2 
| 3-70— 1 = 2 = 1 2 4 
315— = 1 1 = 1 1 4 
3°80— — = 1 = 1 = 3 | 
a 1 2 3 2 1 — Ik | 
3-90— 1 4 6 = 2 ] 19 
3:°95— — BY 3 — 2 4 16 
4:00— 3 4 6 10 3 3 43 
4°05— 6 5 9 8 8 3 59 
Wer) 5 7 20 9 8 7 80 
te 5 8 12 8 6 4 71 
4:20 — 8 1 24 7 8 5 105 
4 25— 2 3 5 7 5 2 36 
4 30— 9 vi 8 12 4 71 
4°85 -- 4 5 al 2 4 37 I 
a0 1 6 2 4 1 y 
|  4£°45— 4 3 5 1 2 36 
450— D 3 1 1 1 17 
4 55— = 2 3 1 3 13 
RG — 1 2 1 1 1 8 
| 4°65 1 : 1 
| 4°70 s 0 
VS p= ae 1 
| 4-80— — 1 
| 485— 1 1 
| Totals 


TABLE F*. Organic Correlation. Value of Brown Ground Colour and Index.

Value of Ground Colour.

[Table body not recoverable from the source.]


TABLE F". Organic Correlation. Value of Green Ground Colour and Index.

Value of Ground Colour.

[Table body not recoverable from the source.]

Totals    34   51   71   133   85   85   69   47   62   32


TABLE Gi.
Organic Correlation. Mottling and Ground Colour Value.


Value of Ground Colour*. 


‘Mottling+ »| B,| | Fl B 
a 2 ra a a _ 9: 
b 3 eee he 3B] 2 2 
c 20 | | 27 | 33 38 | 9 9 
d 1 ae ie! Mee i ae 
e il | 61 || U5 } 21] 5 2 | 
if 7 8 8} 3 
9 3 b 30) 5 5. — =| 
h 3 } 1) — 4) 1 | 
i 2 iL Atal VN aa 1 | 
——S 1 ’ . : = 
Totals | 52 | 34 | 76 | 51 | 63 |-71 | 95 | 133] 51 | 85 | 21 | 85 | 23 | 69 | 19 | 47 | 19 | 62 | 18 | 32 | 1106 


* Two eggs, one given as ‘slatey grey’ and the other as blue, have no ground colour value recorded. 
+ Two eggs have no mottling recorded. 
Organic Correlation of Longitudinal and Transverse Girths.

Organic Correlation of Breadth and Index.

TABLE N.
Direct Homotyposis. Breadths.

Longitudinal Girths.


(1) Introduction.

When several mental tests are applied to a group of subjects, and the correla- 
tions between the tests (taken in pairs) are worked out, the coefficients are as a 
rule found not to be arranged entirely in haphazard order, but to show a certain
degree of what has become known as hierarchical order. This means that if the 
total correlation of each test with all the others is found by adding together its 
coefficients, and if the tests are then arranged in sequence according to the order 
of magnitude of this total correlation, they are found to be also in sequence, or 
nearly so, according to the order of magnitude of their correlations with any one of 
their number. 

If the correlation coefficients are set out, as is convenient, in a square table 
such as the following, the letters $x_1$, $x_2$, etc. being the names of certain mental
tests, and the quantities $r_{12}$, $r_{13}$, etc. the correlations between the marks scored in
these tests, then hierarchical order shows itself in the fact that each coefficient is 
smaller than that on its right or than that below it, provided the tests have been 
arranged in sequence according to the magnitude of the total correlation of each 
with all the others. 
The observed numbers in an actual experiment naturally do not in any case
come out in perfect hierarchical order, and it becomes important to have a measure
of the degree of perfection present, and some means of estimating from what
“true” correlations the observed numbers are most probably derived, and the 
degree of hierarchical order among these “true” correlations. The importance of 
this matter arises in the Theory of General Ability which has been proposed by 
Professor Spearman, for that theory can only be considered proved if the correlations
are derived from an absolutely perfect hierarchy. A merely high degree of
hierarchical order can be attained without any General Factor whatever, by the 
random selection of Group Factors. The very difficult question therefore arises of 
deciding (if possible) whether the hierarchies actually observed in experimental 
psychology are more probably derived from perfect hierarchies such as are postu- 
lated in the Theory of General Ability, or from the good but not perfect hierarchies 
which arise in the Theory of Group Abilities*. 

A criterion which, it was hoped, would give such a measure of the perfection 
of the true hierarchy from which the observed numbers were derived by experiment, 
and which has been widely adopted for this purpose, was worked out by Dr Bernard 
Hart and Professor C. Spearman in the British Journal of Psychology for March,
1912. The object of the present paper is to inquire into the accuracy of that 
criterion. 

(2) A Criterion for Hierarchical Order. 
The underlying idea was that if the above square table of correlation coefficients
shows hierarchical order in any degree, there will be correlation between the
columns of that table taken in pairs, and that when the hierarchical order is
perfect the columnar correlation $R$ will rise to unity, except in so far as it is blurred
by the sampling errors, which obviously cannot increase an already perfect correlation,
but can only decrease it. Let us write dashed letters throughout for the
true values of the various quantities, which in ordinary experiment are unknown,
reserving undashed letters for their measured values. We then have:

$r'$ = true correlation coefficient,
$e$ = its sampling error on one occasion, so that
$$r = r' + e,$$
$\bar{r}'$ = mean of the column of true values $r'$,
$\bar{r}$ = mean of the column of observed values $r$.

In finding these means, that coefficient is omitted which has no partner in the
column with which correlation is being found. Write also

$\rho'$ = $r'$ measured from the mean of the true column, i.e.
$$\rho' = r' - \bar{r}',$$
and similarly
$\rho$ = $r$ measured from the mean of the observed column, i.e.
$$\rho = r - \bar{r},$$
$$\epsilon = \rho - \rho' = e - \bar{e},$$
where $\bar{e}$ is the mean of the column of $e$'s.


* See G. H. Thomson, “The Hierarchy of Abilities,” Brit. Journ. Psychol. 1919, IX. p. 337 and
“The Cause of Hierarchical Order among the Correlation Coefficients of a Number of Variates taken in
Pairs,” Roy. Soc. Proc. A, XCV. p. 400 (April 1st, 1919).
Then for two columns $a$ and $b$, the true columnar correlation which we desire
to know is

$$R'_{ab} = \frac{S(\rho'_{xa}\rho'_{xb})}{\sqrt{S(\rho'_{xa}\rho'_{xa})\,S(\rho'_{xb}\rho'_{xb})}} \qquad (1),$$

by the Bravais-Pearson product-moment formula, $S$ indicating summation over the
various values of $x$, i.e. summation up the column.

This can be written

$$R'_{ab} = \frac{S(\rho_{xa}\rho_{xb}) - S(\epsilon_{xa}\epsilon_{xb}) - S(\rho'_{xa}\epsilon_{xb}) - S(\rho'_{xb}\epsilon_{xa})}{\sqrt{\{S(\rho_{xa}\rho_{xa}) - S(\epsilon_{xa}\epsilon_{xa}) - 2S(\rho'_{xa}\epsilon_{xa})\}\,\{S(\rho_{xb}\rho_{xb}) - S(\epsilon_{xb}\epsilon_{xb}) - 2S(\rho'_{xb}\epsilon_{xb})\}}} \qquad (2).$$

In this expression, the three quantities of the form $S(\rho\rho)$ are known. The three
quantities of the form $S(\epsilon\epsilon)$ are not known, but an attempt can be made to
estimate their probable values from the known standard deviations of the correlation
coefficients. The four quantities of the form $S(\rho'\epsilon)$ are treated by Dr Hart
and Professor Spearman, in their paper, as negligible, on the ground that $\rho'$ will
not in general be correlated with $\epsilon$. It is the object of the next section of this
paper to examine the nature of the correlation of these two quantities.
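
That the expanded form is an exact rearrangement of the product-moment formula, and involves no approximation until terms are dropped, can be confirmed numerically. The sketch below uses made-up columns of true correlations and sampling errors (random numbers with a fixed seed, not data from any experiment); all names are illustrative.

```python
import math
import random

def centred(values):
    """Deviations of a column from its own mean."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def S(u, v):
    """Summation of products up the column."""
    return sum(x * y for x, y in zip(u, v))

random.seed(0)
n = 8
# Hypothetical true correlations and sampling errors for columns a and b.
true_a = [random.uniform(0.1, 0.8) for _ in range(n)]
true_b = [random.uniform(0.1, 0.8) for _ in range(n)]
obs_a = [t + random.gauss(0.0, 0.1) for t in true_a]
obs_b = [t + random.gauss(0.0, 0.1) for t in true_b]

rho_true_a, rho_true_b = centred(true_a), centred(true_b)
rho_a, rho_b = centred(obs_a), centred(obs_b)
eps_a = [p - q for p, q in zip(rho_a, rho_true_a)]
eps_b = [p - q for p, q in zip(rho_b, rho_true_b)]

# Direct product-moment form, using the true deviations only.
R_direct = S(rho_true_a, rho_true_b) / math.sqrt(
    S(rho_true_a, rho_true_a) * S(rho_true_b, rho_true_b))

# Expanded form in observed deviations, errors and cross terms.
numerator = (S(rho_a, rho_b) - S(eps_a, eps_b)
             - S(rho_true_a, eps_b) - S(rho_true_b, eps_a))
factor_a = S(rho_a, rho_a) - S(eps_a, eps_a) - 2 * S(rho_true_a, eps_a)
factor_b = S(rho_b, rho_b) - S(eps_b, eps_b) - 2 * S(rho_true_b, eps_b)
R_expanded = numerator / math.sqrt(factor_a * factor_b)

print(R_direct, R_expanded)
```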


(3) The Relationship between the Correlation Coefficients and their Sampling Errors, 
in the Case of Correlation between a Number of Variates taken in Pairs. 


Consider the formula for the standard deviation of a correlation coefficient, viz.

$$\sigma_r = \frac{1 - r^2}{\sqrt{N}} \qquad (3),$$

where $N$ is the number in the sample. It follows from this that the larger
correlation coefficients will probably have the smaller sampling errors $e$, disregarding
the sign of $e$ for the moment.

But these signs of the quantities $e$ are not likely to be indiscriminately positive
and negative. On the contrary, they will have a tendency to be either all positive
or all negative, if, as is the case in most of the columns of coefficients considered
by Professor Spearman, the correlations in the square table are mainly positive.
The errors in the correlation of a variate $x_1$ with a variate $a$ are themselves
correlated with the errors in the correlation of the variate $a$ with another variate
$x_2$, according to the formula*

$$R_{r_{x_1a}\,r_{x_2a}} = r_{x_1x_2} - \frac{r_{x_1a}\,r_{x_2a}\,(1 - r^2_{x_1x_2} - r^2_{x_1a} - r^2_{x_2a} + 2\,r_{x_1x_2}\,r_{x_1a}\,r_{x_2a})}{2\,(1 - r^2_{x_1a})(1 - r^2_{x_2a})} \qquad (4).$$

That is, the correlation of the sampling errors of $r_{x_1a}$ with the sampling errors of
$r_{x_2a}$ depends chiefly upon $r_{x_1x_2}$. To illustrate, let us take three correlations from
an experiment in psychology, carried out by Mr Wyatt†.


* Karl Pearson and L. N. G. Filon, “On the Probable Errors of Frequency Constants,” Phil. Trans.
of the Royal Soc. 1898, CXCI. A. p. 259.

† Stanley Wyatt, “The Quantitative Investigation of Higher Mental Processes,” Brit. Journ.
Psychol. 1913, VI. p. 181.
If we let $x_1$ be the mental test “Rearranged Letters,”
$x_2$ be the mental test “Missing Digits,”
$a$ be the mental test “Analogies,”
the values there found were
$$r_{x_1a} = 0.63, \qquad r_{x_2a} = 0.61.$$

Then by the above formula the correlation of the errors of these two coefficients
depends chiefly upon $r_{x_1x_2}$, whose measured value is 0.63. Using the full formula,
and employing the measured values in default of the true ones, the correlation
between $r_{x_1a}$ and $r_{x_2a}$ turns out to be .47. It is therefore (to an extent indicated
by this value) probable that they are either both too large or both too small.
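
The value .47 can be reproduced from the three measured correlations. The sketch below assumes the Pearson and Filon expression for the correlation between the sampling errors of two coefficients that share one variate, written as the leading term minus a correction; the function and argument names are chosen for this illustration.

```python
def error_correlation(r_x1a, r_x2a, r_x1x2):
    """Correlation between the sampling errors of r_x1a and r_x2a.

    Assumed form: r_x1x2 minus a correction term depending on all three
    correlations (large-sample result for normally distributed variates).
    """
    bracket = (1 - r_x1x2 ** 2 - r_x1a ** 2 - r_x2a ** 2
               + 2 * r_x1x2 * r_x1a * r_x2a)
    correction = (r_x1a * r_x2a * bracket) / (2 * (1 - r_x1a ** 2) * (1 - r_x2a ** 2))
    return r_x1x2 - correction

# Measured values quoted for Mr Wyatt's three tests.
value = error_correlation(r_x1a=0.63, r_x2a=0.61, r_x1x2=0.63)
print(f"{value:.4f}")
```

The leading term 0.63 is reduced by the correction to a value that rounds to .47, in agreement with the figure stated in the text.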
The same argument holds, in varying degrees, for the other correlations all over 
Mr Wyatt’s table, which are all positive. They have a tendency to be either 
all too large or all too small: in other words, the e’s tend to be all of the same 
sign. The relationship between the correlation coefficients of a column, and their 
errors, can therefore be summed up in the following table, in which the symbol 
|e| denotes the magnitude of e regardless of sign.


TABLE I.

                               ε                      ρ′ε
  r′      |e|     ρ′     e all +   e all −     e all +   e all −
 large   small    +        −         +           −         +
                  +        −         +           −         +
                  +        −         +           −         +
                  −        +         −           −         +
                  −        +         −           −         +
 small   large    −        +         −           −         +

                                      S(ρ′ε) =   −   or    +


The first column shows the true correlations $r'$ arranged in order of magnitude.
The second column expresses the fact that the sampling errors on any occasion
will probably be arranged in the reverse order of magnitude, disregarding their
signs. The third column shows the correlation coefficients measured from their
mean. The upper $\rho'$'s are then positive, and the lower negative, and also, what is
not shown in the table, the absolute values increase upwards and downwards from
the point where the signs change. The fourth (double) column shows the probable
arrangement of the signs of the quantities $\epsilon$. If the $e$'s are all tending to be
positive, then the left-hand member of the double column gives the arrangement,
while if the $e$'s all tend to be negative, the other member of the double column
does so. As shown in the last (double) column, therefore, the quantities $\rho'\epsilon$ tend
either to be nearly all negative or nearly all positive. For a very small sample
the signs of $\rho'\epsilon$ will no doubt be quite irregularly arranged. But with such a
small sample, even if $\rho'$ and $\epsilon$ were really uncorrelated, it would be most unlikely
for $S(\rho'\epsilon)$ to be negligible. As the sample increases the signs tend to settle down
to the above arrangement, and $S(\rho'\epsilon)$ does not tend to disappear compared with
$S(\epsilon\epsilon)$, but only to take on one or other of alternative values. It will only be zero
when all the errors are zero, i.e. when no corrections are needed to $R$. The
distribution of $S(\rho'\epsilon)$ about zero in a number of samples of the same size will not,
that is, show a maximum at zero, but a minimum, as is shown qualitatively in
Fig. 1.


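[Fig. 1. Qualitative frequency curve of $S(\rho'\epsilon)$: horizontal axis $S(\rho'\epsilon)$, running from − through 0 to +; vertical axis frequency.]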


To show the order of magnitude of these neglected quantities, consider the
following example, in which the true correlations are known a priori, and with
their observed values were as follows:

  True r′      Observed r      Error e
  0.730          0.703          −0.027
  0.593          0.708          +0.115
  0.356          0.367          +0.011
  0.174          0.337          +0.163
  0.167          0.281          +0.114
  0.120          0.371          +0.251
  0.116          0.112          −0.004
  0.112          0.133          +0.021

The variates here were made up of dice throws, and the sample was one of 36
cases. Here, knowing as we do the actual true correlations* which would be given
by the whole population or by a sufficiently large sample, we can form the
quantities $S(\epsilon^2_{xa})$ and $2S(\rho'_{xa}\epsilon_{xa})$. They prove to be .064 and −.116. It is
clearly unwise to neglect the latter of these in comparison with the former.
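
The two sums can be formed from the eight pairs of values in this example. The figures in the sketch below are as read from an imperfect print, with the errors taken as observed minus true, so the results should be expected to agree with the stated .064 and −.116 only approximately.

```python
# True and observed correlations of the dice example, as read from the print.
true_r = [0.730, 0.593, 0.356, 0.174, 0.167, 0.120, 0.116, 0.112]
observed_r = [0.703, 0.708, 0.367, 0.337, 0.281, 0.371, 0.112, 0.133]

def centred(values):
    """Deviations of a column from its own mean."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

errors = [r - t for r, t in zip(observed_r, true_r)]   # e = r - r'
rho_true = centred(true_r)                              # rho' = r' - mean(r')
epsilon = centred(errors)                               # epsilon = e - mean(e)

s_eps_eps = sum(e * e for e in epsilon)
two_s_rho_eps = 2 * sum(p * e for p, e in zip(rho_true, epsilon))

print(f"S(eps^2)      = {s_eps_eps:.3f}")
print(f"2 S(rho' eps) = {two_s_rho_eps:.3f}")
```

On these readings the cross term is negative and nearly twice the size of the error term, which is the comparison the text draws.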


* G. H. Thomson, “A Hierarchy without a General Factor,” Brit. Journ. Psychol. 1916, VIII. p. 271.


(4) Experimental Demonstrations in Cases where the True Values of the
Columnar Correlations are known a priori.

The formula at which Dr Hart and Professor Spearman eventually arrive, after
neglecting these quantities and making various other assumptions, is

$$R'_{ab} = \frac{S(\rho_{xa}\rho_{xb}) - (n-1)\,r_{ab}\,\bar{\sigma}_{xa}\,\bar{\sigma}_{xb}}{\sqrt{\{S(\rho^2_{xa}) - (n-1)\,\bar{\sigma}^2_{xa}\}\,\{S(\rho^2_{xb}) - (n-1)\,\bar{\sigma}^2_{xb}\}}} \qquad (5),$$

where the $\sigma$'s are standard deviations of the correlation coefficients, the bar
indicates mean values for the column, and $n$ is the number of pairs of correlation
coefficients concerned, in the two columns. In using their formula, its authors do
not apply it to all the pairs of columns in the square table. They say: “In any 
case the correction must be kept within limits: as usual, the larger the correction 
the less it is to be trusted. If the sampling errors are large enough, they 
eventually will quite swamp the true differences of magnitude upon which the 
observed correlation should be based. In this case, the true correlation is beyond 
ascertainment ; any attempt at correction is merely illusory. To avoid this, and at 
the same time to ensure impartial treatment of all data, it is necessary to fix before- 
hand some definite limit to the feasibility of correction. We have here adopted 
the following standard: in order to attempt to estimate the correct correlation 
between columns, 2 ts required that in each of these columns the mean square 
deviation should be at least double the correction to be applied to that deviation.” 


That is to say, the equation (5) is not to be used unless, in each factor of the
denominator, S(ρ²) is at least double its correction (n−1)σ̄². This condition (the
“correctional standard”) will be found to be important.
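
The arithmetic of formula (5) and of the correctional standard can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function and argument names are invented here, the middle term of the numerator is read as (n−1) r_ab σ̄_xa σ̄_xb, and the sample deviations are made-up numbers, not data from any hierarchy discussed in the text.

```python
import math

def hart_spearman_r(rho_a, rho_b, r_ab, sigma_a, sigma_b):
    """Corrected columnar correlation R' for one pair of columns (assumed reading of (5)).

    rho_a, rho_b : deviations of the observed coefficients from their column means
    r_ab         : correlation between the two variates heading the columns
    sigma_a/b    : mean standard deviation of the coefficients in each column
    """
    n = len(rho_a)  # number of pairs of coefficients in the two columns
    numerator = (sum(x * y for x, y in zip(rho_a, rho_b))
                 - (n - 1) * r_ab * sigma_a * sigma_b)
    factor_a = sum(x * x for x in rho_a) - (n - 1) * sigma_a ** 2
    factor_b = sum(y * y for y in rho_b) - (n - 1) * sigma_b ** 2
    if factor_a * factor_b <= 0:
        return None  # R' is infinite or imaginary for this pair
    return numerator / math.sqrt(factor_a * factor_b)

def passes_correctional_standard(rho_a, rho_b, sigma_a, sigma_b):
    """True when S(rho^2) is at least double its correction in both columns."""
    n = len(rho_a)
    ok_a = sum(x * x for x in rho_a) >= 2 * (n - 1) * sigma_a ** 2
    ok_b = sum(y * y for y in rho_b) >= 2 * (n - 1) * sigma_b ** 2
    return ok_a and ok_b

# Hypothetical deviations (they sum to zero, as deviations from a mean must).
rho = [0.2, -0.1, 0.1, -0.2]
print(passes_correctional_standard(rho, rho, 0.05, 0.05))
print(hart_spearman_r(rho, rho, 0.5, 0.05, 0.05))
```

With two identical columns the uncorrected columnar correlation is exactly unity, yet the corrected value in this made-up case comes out above unity, which is the kind of exaggeration examined in the examples that follow.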

It is clear that the accuracy of this formula (5) could be conveniently tested 
were we in possession of material in which all the true correlations were known 
a priori, in addition to the observed correlations found in samples. Such material
is supplied in perfection by correlated dice throws. 

First Example. The first experiment with dice of the above nature which 
I carried out was described in the Brit. Journ. Psychol. 1916, VIII. There ten
variates were artificially made up of group factors and specific factors, without any 
general factor, so as to make a very good hierarchy, which gave the following 
results when tested by the Hart and Spearman criterion. 


TABLE II.

Columns     Observed      True          The Hart and Spearman
passing     columnar      columnar      corrected columnar
standard    correlation   correlation   correlation R′

ab          0.95          1.00          1.04
ac          0.89          0.99          1.00
bc          0.91          1.00          1.01
cd          0.90          1.00          1.11
Means       0.91          1.00          1.04


Here the exaggeration of the Hart and Spearman R’ is not very noticeable, for 
the hierarchy is in any case almost perfect. Indeed in this case I took some pains 
to make the arrangement of group factors imitate a perfect hierarchy very closely, 
for the sake of emphasising the point I then wished to make, viz., that such group 
factors can, unaided by any general factor, approach exceedingly close to perfection 
of hierarchical order. I did not then realise that the pains I took over this point 
were hardly necessary, for random sampling of the group factors gives good 
hierarchies, though such perfection as the above would be unlikely to arise from
chance. 


Second Example. For a second example I have therefore chosen a hierarchy
formed thus by the chance sampling of group factors, without any general factor,
and moreover one which shows considerable departure from perfection of hierarchical
order, it being the least perfect of those which I have up to the present formed in
this way. The mode of construction of the variates is given in detail in Roy. Soc.
Proc. A. XCV. (April 1st, 1919) on page 402, and the theoretical correlations on
page 403 of that article. The latter show a certain degree of hierarchical order,
though not very high, the true mean columnar correlation R for all pairs of
columns being 0.59.


Dice were now thrown to form 20 measures of each of the ten variates,
x₁, x₂, x₃, … x₁₀.

First the magnitudes of the group factors (which it will be recalled were in that 
article named after the cards of a playing pack) were decided by throwing dice, 
with the following results. 


TABLE III.
Number identifying |  Name of Group Factor
the subject        |  A  2  3  4  5  6  7  8  9  10  Kn  Q  K
1 Gee letGhe 6b A Aas) ale ea 
2 2 4 3 2 6 4 3 56 6 2 5°3 8 
3 Dee le eed Go ie 45 3) Go 4s 8 
4 6) 6 2) 5 5 5 6 G6 4° 5 «#5. 4 5 
5 § 56 -4 9° 2 5 2 4 .T° 1 4 4 3B 
6 5.6 9 6 1" 4 5-6 6 4°2 °5 4 
2 i> 3.4 6 4 6 (6 2, 3 2 5b) 4 
8 [GRE Oh? mee: leeds Ge 32) 2 
9 Te Ge doe sie 2 2 I a) be 4 ly 5 
10 Ge ad e325 12) A br I ae A BD 
11 DAS AS ODA ES 28 esr a3) 2 8 22 
12 Gio Ome ele eee oe eee reliee 2 2 mel all eG 
13 6a 6) bo I Ge es 4° 4A 8 
14 De 165 eo bo 8 ba 4 
15 Deez SO 22 ro oI 6N 9G io 
16 6 2 6 4 4 6 3 6 4 6 2 2 38 
ile 2S Ie 6S 3 B= 2 62 6G 8 5 
18 4 1 2 4 2 4 3 6 4 6 6 3 5 
19 Gee 6 8 Se Ie oe 25 545s 3 ol 64 
20 Siero 2) Oe 66 2) See Gy IG) 62) 4G 


Using these numbers, we can make up the scores for the group factor portion
of each of the ten tests described in the article quoted. The results are given in Table IV.

The proper number of dice, as described in the article quoted, were then
thrown for each test and for each subject to represent the specific factors, and the
scores of these dice added to the scores given in the last table, the resulting total
being the complete score for each subject in each test (Table V).
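
The procedure just described (one throw per group factor per subject, then further dice for the specific factors) can be imitated on a computer. The sketch below is only a hedged illustration: the actual composition of the ten tests is given in the article quoted and is not reproduced here, so the membership lists and the numbers of specific dice in the example are hypothetical.

```python
import random

def simulate_scores(n_subjects, group_membership, n_specific,
                    n_group_factors=13, seed=0):
    """Return one list of total test scores per subject.

    group_membership : for each test, the indices of the group factors it contains
    n_specific       : for each test, the number of dice thrown for its specific factor
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(n_subjects):
        # One die per group factor decides that factor's magnitude for this subject.
        g = [rng.randint(1, 6) for _ in range(n_group_factors)]
        row = []
        for members, k in zip(group_membership, n_specific):
            group_part = sum(g[i] for i in members)
            specific_part = sum(rng.randint(1, 6) for _ in range(k))
            row.append(group_part + specific_part)
        scores.append(row)
    return scores

# Hypothetical two-test design sharing group factor 1; not the design of the article.
demo = simulate_scores(20, [(0, 1), (1, 2)], [1, 2])
print(demo[:3])
```

Tests that share group factors will be correlated in such a scheme although no factor is common to all of them, which is the point of the construction.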

From the dice scores the observed correlations between the variates can be 
TABLE IV.
Number identifying |  Scores in the group factor portion of the tests
the subject        |  x₁  x₂  x₃  x₄  x₅  x₆  x₇  x₈  x₉  x₁₀

1 20.915 «37 1) 25 31 42% ee oe 
2 18 20 42 3 2 36 46 4 32 40 
3 1 19. 40, 3 22° -34 43 7 270935 
4 28 24 59 +4 38 45 64 6 46 54 
5 19 10 39 4 27 2 44 65 34 38 | 
6 21 21 52 5 32 39 #56 6 39 45 
a 18 22 43 5 29 38° 47 ‘1 385 ‘44 
8 15 19 35 3 22 30 38 3 26 36 
9 17. 14 887d 19 26 89 CGO 
10 14 138 36 3 22 26 38 1. 28 81 
1] 1415 B20 B18 8 BH 488 
12 17 6 ~30°..1 19 Jl 32 3°) 225 
13 19.21 46 3 28 32 46> "32a 3h 
14 23 18 41 4 30° 29 46 6 34 36 
15 19 <2) (-43°- 2: 22° 330> 48) 2. "33eaesi 
16 | 18 19 48 2 26 37 54 2 40 44 
17 | 13 <1b.°39 22° 21 36 40: 291 Oe somam 
18 15 18 46 3 24 38 50 1 31 42 
19 | i> 18 42 2 i) 26) 445 fo soles 
20 16 20 44 4 23 36 50 3 32 «41 
| 
TABLE V.
Number identifying |  Total scores in the tests
the subject        |  x₁  x₂  x₃  x₄  x₅  x₆  x₇  x₈  x₉  x₁₀
| 36 2670 9 47 49 92 10 54 #541 
| 2 33. 28 Yh 149 58 655). 822 el ee 
3 22 31 77 «+11 47° «#Sl 98 9 41 58 
4 44. 34 104 «150 «0557059 114 13) 60 «6 
5 35 21 72 17 #47 +46 88 11 46 58 
6 41 37 79 17 50 55 105 12 49. 61 
7 360 3838840 OT 54 BQ 9B 4 43 62 
8 31 630088 7 40 56 838 12 41 49 
9 40 23 92 16 38 44 81 10 39 54 
10 36 24 90 14 47 46 89 10 34 = 82 
11 29 19 67 18 48 40 72 9 35 49 
12 36 «18 ~«©68— O16 B86 8338 CD 9 37 43 
13 40 39 76 18 46 45 77 8 48 58 
14 37 «©6238:0—C«Ci80—id6 4 C4 9s BSC*#S‘&L' 
15 35 29 89 10 41 44 104 8 44 54 
16 38 6-80) 80 14 AT 54 8B 7 53 63 
17 30 27 «6883612 C460 S481 9 40 56 
18 38 6-28 «=o 92 12 BO 104 8 38 63 
19 34 22 91 10 387 «41 9% 8 43 54 
20 35 35 682 6160648 (62 SO101 7 40 61 


calculated, just as the correlations between mental tests are calculated. Using 
the product-moment formula we obtain the set of values in Table VI, arranged 
in hierarchical order, only slightly different from the true hierarchical order, except 
that variate a has changed its position rather violently. 
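
For readers who wish to repeat the calculation, a minimal Python sketch of the product-moment formula follows. The ordering helper is an assumption made for illustration: it arranges the variates by the total of their correlations with the others, which is one common way of setting out a hierarchy and is not stated in the text to be the rule used here.

```python
import math

def pearson(x, y):
    """Product-moment correlation of two equally long lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def hierarchical_order(columns):
    """Order variate names by the sum of their correlations with all the others."""
    names = list(columns)
    totals = {a: sum(pearson(columns[a], columns[b]) for b in names if b != a)
              for a in names}
    return sorted(names, key=lambda a: totals[a], reverse=True)

# Made-up scores for three variates, only to show the calls.
scores = {"u": [1, 2, 3, 4], "v": [2, 4, 6, 9], "w": [4, 1, 3, 2]}
print(hierarchical_order(scores))
```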


GODFREY H. THOMSON


TABLE VI. 
The Observed Hierarchy. 

X10 rd Xo U5 v2 XL a XL V4 Vs 
Ly e 72 47 64 53 50 34 “45 2] 09 
Ve “72 e 48 “45 79 48 32 [O(m—a 20 10 
Xg *47 “48 e ‘Ol “46 45 “50 46 —-02 24 
Xs “64 43 ‘O1 e 58 60 20 . °15 29 08 
a) D3 ‘75 “46 58 e 63 "26 33 05 -"1l 
x 50 “48 “45 60 63 e "22 "29 —°16 18 
Ly 34 “B32 D0 20 26 22 « “41 38 “15 
Xs “45 “67 “46 “15 33 "29 “41 ° —°20 08 
vy ‘21 -—°26 —-02 29 705 — "16 38 =— 20 ° — ll 
ag 09 ‘10 24 08 —--l1l 18 HHS) 08 —-l1l . 




The pairs of columns which pass the Hart and Spearman correctional standard 
give the following values: 


TABLE VII.

Columns     Observed      True          The Hart and Spearman
passing     columnar      columnar      corrected columnar
standard    correlation   correlation   correlation R′

2 & 7       0.73          0.75          0.76
6 & 7       0.63          0.89          1.15
2 & 3       0.70          0.60          1.01
2 & 6       0.81          0.88          1.06
3 & 6       0.66          0.83          1.04
Means       0.71          0.79          1.00

True mean columnar correlation of the whole table, and not merely
of the pairs of columns selected by the correctional standard:   0.59


Dr Hart and Professor Spearman would therefore claim the hierarchy as being
a sample of a perfect one. The true mean columnar correlation for the whole
table is 0.59, the Hart and Spearman correctional standard selects pairs of columns
whose true mean columnar correlation is 0.79, and the mean value of these when
corrected according to their formula rises to unity. This example goes far, I think,
towards shaking confidence in their criterion.
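
The three means quoted from Table VII can be checked directly. The short Python sketch below simply averages the five tabulated values in each column; the variable names are introduced here for illustration.

```python
# Values for the five pairs of columns passing the standard (Table VII).
observed  = [0.73, 0.63, 0.70, 0.81, 0.66]
true      = [0.75, 0.89, 0.60, 0.88, 0.83]
corrected = [0.76, 1.15, 1.01, 1.06, 1.04]  # Hart and Spearman R'

def mean(values):
    return sum(values) / len(values)

for label, column in (("observed", observed), ("true", true), ("corrected", corrected)):
    print(label, round(mean(column), 2))
```

The selected pairs thus have a true mean well above the 0.59 of the whole table, and the correction carries the mean further still, to unity.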


It must, I think, be partly chance which makes it so peculiarly unfavourable to 
their work: but I give it as it came. Really a very large number of such examples 
is necessary, and not all of these could be expected to be so unfavourable. The 
only other example which I have attempted I have carried far beyond 20 cases 

without as yet reaching a point where any of the columns pass the correctional
standard. I feel that working a large number of such examples is beyond the 
power of an individual, with other claims on his time, and rather a task for a 
statistical laboratory with experienced computers and mechanical aids. 

(5) The Effect of the Correctional Standard. 

Clearly the fact that the criterion is apparently too large in a majority of cases
requires further explanation beyond the error already pointed out of neglecting
the terms in ρ′.

The other approximations made in obtaining the criterion do not appear to be
so erroneous as this one, though their cumulative effect may explain some
anomalies. Leaving them on one side let us consider the “correctional standard”
required by Dr Hart and Professor Spearman before they admit any pair of columns.
It is this correctional standard, combined with the peculiar distribution of R′,
which chiefly is responsible for the exaggeration of perfection produced by this
criterion, and for the regularity with which an average value of unity is arrived at.

Let us examine first the actual distribution of the Hart and Spearman R’ in a 
psychological hierarchy, viz., that of Wyatt already referred to, and calculate R’ 
not only for those columns which pass the correctional standard, but also for other 
pairs of columns. What we find is that its value rises as we descend the hierarchy, 
rushing asymptotically to infinity, remaining for a time imaginary, and then 
returning. The value reaches infinity when one of the corrections in the denomi- 
nator becomes as large as the term to be corrected, and remains imaginary until 
the other term is likewise passed by its correction, when both quantities under the 
square root are negative and an arithmetically possible but meaningless value is 
again calculable. Specimen values from Mr Wyatt’s hierarchy are given in the
following table.

TABLE VIII.

Pairs of Columns                              Values of the Hart and Spearman R′

Analogies and Wordbuilding                    0.93  }
Completion and Wordbuilding                   0.97  }  Passed by the
Completion and Part-wholes                    1.05  }  correctional
Wordbuilding and Part-wholes                  0.99  }  standard
Part-wholes and Memory (delayed)              0.92  }
Rearranged letters and Missing digits         1.17
Wordbuilding and E R Test                     1.26
Sentence construction and Fables              1.33
Rearranged letters and E R Test               Practically infinity
Nonsense syllables and Dissected pictures     Imaginary
Crossline test and Letter squares             0.35, both factors in the denominator
                                              being now negative.
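
The succession of cases just described (real, infinite, imaginary, and real again with both factors negative) depends only on the signs of the two corrected factors in the denominator. A small Python sketch makes this explicit; the function name and the sample figures are assumptions introduced for illustration.

```python
def regime(sum_sq_a, correction_a, sum_sq_b, correction_b):
    """Classify R' by the signs of the two corrected factors of its denominator.

    sum_sq_*     : S(rho^2) for each column
    correction_* : the correction (n - 1) * sigma_bar^2 for each column
    """
    factor_a = sum_sq_a - correction_a
    factor_b = sum_sq_b - correction_b
    if factor_a == 0 or factor_b == 0:
        return "infinite"          # a correction equals the term it corrects
    if factor_a > 0 and factor_b > 0:
        return "real"              # the ordinary case
    if factor_a < 0 and factor_b < 0:
        return "both negative"     # calculable again, but meaningless
    return "imaginary"             # exactly one factor is negative

print(regime(0.10, 0.02, 0.10, 0.02))
print(regime(0.05, 0.10, 0.10, 0.02))
print(regime(0.05, 0.10, 0.04, 0.10))
```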
Expressed in diagrammatic form this and similar calculations lead to the 
conclusion that in actual practice the criterion is distributed as in Fig. 2, where 
the curve is to be understood as a “best fitting” curve among the values of R′
scattered, with a very considerable dispersion, on both sides of it. The line, in 
fact, ought to be a broad smudge. 


Now clearly, with a distribution of this sort, it is very important that the
boundary between the values that are to be rejected and those that are to be
accepted should be chosen with the greatest care, and not arbitrarily but scientific-
ally. Either sound theoretical reasons should be given for the choice of the
correctional standard, or the choice should be based empirically on experiments in
material where the truth is known a priori, as in the above dice experiments. For
obviously, by moving this boundary, we can make the final average take on almost
any value. Another point is that the criterion rushes to infinity at such speed
that its probable error must be enormous. Dr Hart and Professor Spearman,
however, give no reasons for their choice of this particular standard, upon which
depends so much the values they obtain. The standard which they thus arbitrarily 
adopt begins admitting the criteria at just such a distance above unity as to 
balance the cases which give a criterion below unity, and entirely explains the 
remarkable unanimity with which this average value unity is obtained by them in 
their calculations. 
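
How far the final average depends on where the boundary is drawn can be seen from the specimen values of Table VIII themselves. The Python sketch below takes the eight finite values in the order tabulated and prints the mean obtained when the first k pairs are admitted. It assumes that the first five rows are those passed by the correctional standard; the names used are illustrative only.

```python
# Finite specimen values of R' from Mr Wyatt's hierarchy, in tabulated order.
r_values = [0.93, 0.97, 1.05, 0.99, 0.92, 1.17, 1.26, 1.33]

def running_means(values):
    """Mean of the first k values, for k = 1, 2, ..., len(values)."""
    means, total = [], 0.0
    for k, value in enumerate(values, start=1):
        total += value
        means.append(total / k)
    return means

for k, m in enumerate(running_means(r_values), start=1):
    print(k, round(m, 3))
```

Admitting only the first five pairs gives a mean a little under unity; each further pair admitted raises it, so the position of the boundary largely settles the answer.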


(6) Conclusion. 


A criterion suggested by Dr Hart and Professor Spearman has been widely 
used by psychologists for the purpose of ascertaining the degree of “hierarchical”
order among theoretical correlation coefficients of which only experimental values
are known, and a Theory of General Ability has been based on the results. In the
present paper it is however shown theoretically that an assumption made in
deducing this criterion, namely that ρ′ and e are uncorrelated and the sums S(ρ′e)
negligible, is incorrect. The quantity e taken regardless of sign is strongly corre-
lated with ρ′, and its signs tend to be either all the same as, or all different from,
those of ρ′. The distribution of the sums S(ρ′e) shows a minimum, not a maximum,
at zero.

Otherwise the paper is empirical, and applies the criterion in question to
correlated dice throws. In the cases tried, this criterion exaggerates the perfection
of the hierarchy considerably, claiming a quite poor hierarchy formed by random
group factors as being perfect (true mean columnar correlation 0.59, the Hart and
Spearman R′ = 1.00). The reason for this exaggeration, and for the unanimity
with which in so many experiments the average value unity has been found for the 
Hart and Spearman criterion, appears to be mainly the peculiar distribution of 
this quantity, combined with the action of the “correctional standard” adopted, 
which commences admitting the criteria at such a distance above unity as to 
balance those which are less than unity. 


MISCELLANEA. 


I. Inheritance of Psychical Characters. 
By KARL PEARSON, F.R.S.


In view of the papers that have been published on the inheritance of intelligence, it is 
strange that there should still remain any doubt that psychical characters are inherited at the 
same rate as physical characters. But having regard to the existence of that doubt any material 
bearing on the point deserves special recognition and emphasis. 


In a recent contribution to the Journal of Delinquency, Vol. IV. p. 46, Dr Kate Gordon gives
the results of her tests by the Binet-Simon method of the intelligence of the children in three
orphanages in California. Among other data she gives, almost as an aside, a small table for
the correlation in intelligence-quotients of 91 pairs of siblings. This table appears to me
of very considerable interest and supplies what is occasionally lacking, a nearly uniform
environment* both in training and in nourishment to the pairs dealt with. Those who dislike the
idea that the mental as well as the physical characters are largely fixed for us by our ancestry 
are apt to attribute—regardless of known measurements of the intensity of environmental 
influence—the correlation of pairs of siblings for mental characters to a differential environment 
of the pairs, i.e. to differential family or home training. Hence the value of data obtained 
within the walls of an orphanage, as tending to minimise this differentiation. 


The Intelligence Quotient, it will be remembered, is the ratio of the mental age as given by an 
intelligence test of the Binet-Simon type to the actual age. The accompanying correlation table 
is the ‘scatter’ table of Dr Gordon rendered symmetrical, so that we can enter with either member
of the pair. The probable error must, of course, be calculated for the correlation on the basis of
91 pairs, but for the mean and standard-deviation on 182 individuals. We find:


Mean Intelligence Quotient ............................ = 92.857 ± .836,
Variability in Intelligence, S.D. ..................... = 16.727 ± .591,
Coefficient of Variation .............................. = 18.014 ± .657,
Correlation of Intelligence between Siblings .......... r = .5082 ± .0524.
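
The probable errors attached to these constants can be recovered with the usual large-sample formulae, taking the probable error as 0.6745 times the standard error. The Python sketch below is a check under that assumption; the function names are introduced here, and the formula used for the coefficient of variation is the customary one with the factor √(1 + 2(V/100)²).

```python
import math

Q = 0.6745  # ratio of probable error to standard error for a normal variate

def pe_mean(sd, n):
    return Q * sd / math.sqrt(n)

def pe_sd(sd, n):
    return Q * sd / math.sqrt(2 * n)

def pe_cv(cv, n):
    # cv is the coefficient of variation expressed as a percentage
    return Q * cv / math.sqrt(2 * n) * math.sqrt(1 + 2 * (cv / 100) ** 2)

def pe_r(r, n_pairs):
    return Q * (1 - r * r) / math.sqrt(n_pairs)

print(round(pe_mean(16.727, 182), 3))   # mean, on 182 individuals
print(round(pe_sd(16.727, 182), 3))     # standard deviation, on 182 individuals
print(round(pe_cv(18.014, 182), 3))     # coefficient of variation
print(round(pe_r(0.5082, 91), 4))       # correlation, on 91 pairs
```

The computed values agree with the probable errors quoted to the figures given, which confirms that the correlation is referred to 91 pairs and the other constants to 182 individuals.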


At first sight it might seem as if the mean Intelligence Quotient was somewhat low. For a
normal child it should be theoretically 100, but so much depends on the nature of the tests
used and also on the manner in which they are applied that we cannot dogmatise on this point.
In some recent American data we found a very low intelligence quotient among literate adults,
and the result was clearly due to the nature and method of applying the test. The coefficient
of variation in this case rose to the high value of 38.52, fully double the value we have found in
other cases. We may note that the coefficient of variation is also large in the present case,
which is distinctly against intelligence being much influenced by environmental conditions,
for in this instance we have considerable approximation to uniformity of environment. For
261 normal children examined by the Binet-Simon method by Dr Jaederholm, I find the
coefficient of variation in intelligence as measured in mental years to be 19.476. For 420
children in two schools I find a coefficient of variation in general intelligence of 21.986, and for
1725 children in eight schools I find a coefficient of variation in terms’ marks of 23.133.
These are somewhat greater than the variability obtained for the orphanage children, but do
not show the great increase some might anticipate from variety in home and school training,
and the increase of the last two results may be solely due to the different standards imposed by
the judgments of a variety of teachers instead of, as in Jaederholm’s and our present cases,
an identical series of tests made by a single psychologist. The noteworthy value, however, lies
in the correlation of .508. The values obtained for 12 cases of physical characters in siblings
(Biometrika, Vol. II. p. 387) have exactly this value for their mean. No stress can, of course, be
laid on the absolute identity considering the smallness of the present series, but much stress
may be laid on the approximation of the two results.

* The ideal method would be to take all the siblings in a very large orphanage, such for example as
the Reedham asylum, and select if the numbers should prove adequate only the children who had
entered the orphanage at an early age.

Inheritance of Intelligence Quotient in Siblings.
[Correlation table, entered by the Intelligence Quotient of the First Sibling; the entries are not recoverable.]


But the present data are of further interest, although they are so slender, when we compare
the results to be obtained from them with those for a far longer series of pairs of siblings 
obtained by the method of “broad-categories.” This series is also formed from pairs of siblings 
who are children. They belonged to a great variety of schools taken throughout Great Britain. 
Every variety of environment, every variety of educational and home training is therefore 
included. Accordingly if the intellectual resemblance of siblings were the result or largely 
the result of differential treatment, we ought to anticipate a great increase of correlation in this 
material over that of the material drawn from the Californian orphanages. We have also the 
possibility of obtaining light on two further problems : 


(i) Whether the method of “broad-categories” really does give results markedly inferior 
to the Binet-Simon method of direct quantitative measurement. 


(ii) What is the approximate value of the “mentace” or unit of intelligence in terms of a
unit obtained from a Binet-Simon test. 


The definitions of the “broad-categories” used by the Galton Laboratory in its intelligence
investigations have already been published in this journal*, and a “mentace” has been defined
as the 1/100 part of the range which limits the category “Intelligent†.” Now if we compare the
two series, the one determined by “broad-categories” and the other by the Binet-Simon test, for
the total frequencies up to the beginning and up to the end of the range “Intelligent,” we shall
have a first approximation (on the assumption that both series are measuring the same general
intelligence character and both approximate to normal distributions) to the absolute value of a
“mentace.” I find that my mentace is equal to .1604 of Dr Gordon’s intelligence quotient
units, or with the average age of 10.2 (which appears to have been that of her children) it equals
about six days of mental growth of children at this age. Roughly we might say that a mentace
is equal to about a week’s mental growth at the age of ten years. In estimating the meaning of
this statement we must remember that mental growth is very rapid at this age‡.
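
The figure of about six days follows from the definition of the intelligence quotient as 100 × mental age / actual age: at a given actual age, one unit of the quotient is one hundredth of that age in mental years. A short Python check, with names introduced here and a year taken as 365.25 days (an assumption), runs as follows.

```python
MENTACE_IN_IQ_UNITS = 0.1604   # value found in the text
MEAN_AGE_YEARS = 10.2          # average age of Dr Gordon's children
DAYS_PER_YEAR = 365.25         # assumed length of year

# One I.Q. unit at this age, expressed in days of mental growth.
iq_unit_days = MEAN_AGE_YEARS / 100 * DAYS_PER_YEAR

# One mentace, expressed in days of mental growth at this age.
mentace_days = MENTACE_IN_IQ_UNITS * iq_unit_days
print(round(mentace_days, 2))
```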


As the American data pool children of both sexes I have for purposes of comparison done the 
same. The following table represents my material for 5602 children in 2801 pairs, each pair 
being entered either way so as to produce a symmetrical table. 


* Biometrika, Vol. VII. p. 93.

† Biometrika, Vol. V. p. 109.

‡ The reader will of course avoid the conclusion that the mentace is an intelligence unit varying
with age. It is the time rate of growth of intelligence which varies with age, and we must state
a particular age in evaluating the mentace in terms of growth of intelligence.
Contingency Table for General Intelligence in Siblings.

Category of Intelligence of First Sibling.

[Only the marginal totals of this 6 × 6 table are legible; the cell frequencies are not recoverable.]

Category              Totals
Quick Intelligent      767.5
Intelligent           1927.5
Slow Intelligent      1742.5
Slow                   773.5
Slow Dull              291
Very Dull              100
Totals                5602

The first question that arises is that of the method to be employed in the reduction of this 
table. The answer is fairly straightforward. The only legitimate method is that of corrected 
contingency. The mean square contingency of this 6 x 6 fold table is 


φ² = .2931833.


Of the two methods of correcting this raw mean square contingency* for number of cells one 


leads to 


and the other to 


giving the nearly equivalent results for the contingency coefficient C,
C = .4722 and C = .4732.
The class-index correlations are the same in both directions and we find : 
+ . Soa we 
Ve Pew =i 91738. 
Hence finally we have for the correlation 
r = .5147 and r = .5158,


and accordingly it is amply adequate to take the correlation of siblings in general intelligence
to be .515, which agrees excellently with the value .508 found from Dr Gordon’s data.
But as the bald figures .508 and .515 convey little to the mind untrained to statistical
appreciations, I have attempted to provide an illustrative diagram; see Plate VII.
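
The passage from the corrected contingency coefficients to the correlations can be verified numerically: each quoted r is the corresponding contingency coefficient divided by the class-index factor .91738. The Python sketch below checks only this arithmetic; that the division is by the tabulated factor taken once is inferred from the printed figures, and the names are introduced here.

```python
CLASS_INDEX_FACTOR = 0.91738          # value quoted in the text
contingency_coefficients = [0.4722, 0.4732]

correlations = [c / CLASS_INDEX_FACTOR for c in contingency_coefficients]
for c, r in zip(contingency_coefficients, correlations):
    print(c, round(r, 4))
```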


Assuming normal distribution for the marginal totals and the arrays, I have superposed the
means of the two systems (General Intelligence and Dr Gordon’s Stanford Revision of the
Binet-Simon tests) and equated their S.D.’s. Using I.Q.U. for an intelligence quotient unit, i.e. a
change of one digit in the intelligence quotient (or 100 × mental age / physical age), we find:


Mean = 578.909 mentaces = 92.857 I.Q.U.’s, Standard Deviation = 95.5566 mentaces = 15.3215 I.Q.U.’s.
Thus a mentace = .1604 I.Q.U.


* It is hoped to publish shortly the long-delayed memoir on contingency corrections. The delay 
has largely arisen from the labour involved in reducing adequate material by way of illustration. 
[Plate caption:] Comparison of Association in Intelligence of Pairs of Siblings as determined by Broad Categories and Binet-Simon Tests.

Boundaries between Categories and Means of Categories measured from
the Mean of Intelligence.


Boundary between                        Mentaces        I.Q.U.’s
Quick Intelligent and Intelligent        104.5294         16.7602
Intelligent and Slow Intelligent           4.5294          0.7262
Slow Intelligent and Slow                −77.7640        −12.4686
Slow and Slow Dull                      −141.1658        −22.6345
Slow Dull and Very Dull                 −200.6975        −32.1797

Mean of                                 Mentaces        I.Q.U.’s
Quick Intelligent                        152.967          24.527
Intelligent                               49.756           7.978
Slow Intelligent                         −34.410          −5.517
Slow                                    −105.552         −16.924
Slow Dull                               −165.590         −26.551
Very Dull                               −235.299         −37.728
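
The two columns of this table are connected by a single factor, the ratio of the two standard deviations (15.3215 I.Q.U.’s to 95.5566 mentaces). The Python sketch below converts the boundaries from mentaces to intelligence quotient units on that assumption; the dictionary and function names are illustrative.

```python
SD_IQ_UNITS = 15.3215
SD_MENTACES = 95.5566
IQ_UNITS_PER_MENTACE = SD_IQ_UNITS / SD_MENTACES  # about .1604

def to_iq_units(mentaces):
    """Convert a deviation from the mean, in mentaces, to I.Q. units."""
    return mentaces * IQ_UNITS_PER_MENTACE

boundaries = {
    "Quick Intelligent / Intelligent": 104.5294,
    "Intelligent / Slow Intelligent": 4.5294,
    "Slow Intelligent / Slow": -77.7640,
    "Slow Dull / Very Dull": -200.6975,
}
for name, value in boundaries.items():
    print(name, round(to_iq_units(value), 4))
```

The same factor gives about 50 intelligence quotient units for 312 mentaces, and about 180 mentaces for an intelligence quotient of 29.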


After careful consideration of a number of factors we divided* our “Intelligent” category of
100 mentaces range into “Fair intelligence” for the first 45 mentaces and “Capable” for the
remaining 55 mentaces. Our “Quick Intelligent” category was again subdivided into a range
of 200 mentaces corresponding to “Specially Able” and to “Genius” or the 1.4 per mille who
exceed the mental type by more than 300 mentaces. The “Very Dull” were again subdivided
at 300 mentaces less than the mean and the 1.4 per mille beyond this may be looked upon as
mentally defective. This per mille of mental defectives corresponds fairly well with the primary
school returns. Thus the average ‘genius’ will have 312 mentaces or be almost exactly 50
I.Q.U.’s above mediocrity, i.e. with a mean of 143 I.Q.U.’s, and the average mentally defective
312 mentaces or 50 I.Q.U.’s below the type, i.e. will have about 43 instead of 93 for intelligence
quotient†. These limits are marked on our diagram.


Dr Gordon’s results therefore bring out a point that was not correct in my diagram of 1906. 
The zero of intelligence is not about 300 mentaces below mediocrity, but nearer 600! Even an 
“imbecile” girl has an intelligence quotient of 29, or some 180 mentaces, where I in 1906 
assumed she should be credited with none. I still think complete imbecility should be marked 
by a total absence of mentaces or by a zero intelligence quotient. It appears better therefore to 
talk of those with intelligence less by 300 mentaces than the mean as mental defectives‡. The
problem is rather theoretical than practical, depending not so much on the existence of zero 
intelligence, as on the limen or threshold value at which we are able to realise its existence. 
Anyhow the conclusion seems to be that we must search a large number of millions if we wish 
to find an individual absolutely without intelligence. 


Examining our diagram we note how extremely closely the black points which represent the
means of the general intelligence categories lie on their regression line. They lie so closely that
we might almost feel disappointed that the means for the Slow Dull and Very Dull categories
are not equally close to the regression line. But here regard must be paid to the fact that these
are the smallest of the categories in size; and further to disturbing factors arising from the
probability that dull children remain longer at school than very intelligent ones, and start later.
These factors act in a not easily interpretable way on the number of pairs of dull and very dull
children.

* See Biometrika, Vol. V. p. 110.

† Dr Gordon notes a very able girl with 137 I.Q.U.’s and an imbecile girl with only 29 I.Q.U.’s in
a total of 335 cases.

‡ I wrote in 1906 (Biometrika, Vol. V. p. 111 ftn.) that: “He [the median individual] can hardly
have more than 350 to 400 mentaces, for at a negative position of −350 to −400 on the scale we have
passed through the very dull group into imbecility and complete absence of reasoning power. The
child whose low grade of intelligence occurs only 3 or 4 times in 100,000 cases must be sought in the
idiot asylum.” I was probably wrong in assuming the worst type of idiot had zero intelligence.
Dr Gordon’s mean is 6.06 times her S.D., or the absolute zero of intelligence would only occur 1 in
100,000,000. This is probably excessive. Dr Jaederholm’s data appear to indicate 5.5 times as the
ratio or 1 in 12,500,000 as the occurrence.

On the other hand Dr Gordon’s plotted observations show little more than the variations due 
to random sampling in such a slender series. Both sets of observations together undoubtedly 
indicate within the limits of error one and the same law of relationship*. It is almost impossible 
to conceive that such diverse environmental conditions rather than a fundamental germinal 
relation could produce such concordance, The conclusion which is emphasised by material 
drawn by such different methods from such very different environments is that the relation of 
intelligence between siblings is fixed by something more innate than environment. That some- 
thing more innate, more constant and more universal in its domination can only be the hereditary 
factor. 

Of course the results in the present paper for the relation between Intelligence Quotient and 
Mentace can only be considered as suggestions until we have far longer series of pairs of siblings 
tested by the Binet-Simon or allied methods. But they serve to indicate that very fruitful work 
can be achieved in this direction, and even the present data owing to their relatively limited 
environmental conditions may help to dispel the notion—largely based on prejudice, not on 
acquaintance with actual measurements—that differential environment is the source of resem- 
blance between siblings. 


II. Variation and Distribution of Leaves in Sassafras. 


By N. M. GRIER. 


The following note is made on the basis of examination of ten sassafras trees and 102 seedlings 
near Pittsburg, Pa., and eight trees near St Louis, Mo. Only three kinds of leaves were met 
with, three-lobed, two-lobed and single-lobed, but it may be inferred that the same laws will 
govern the distribution of the four- five- and six-lobed forms described by Berry some years ago in 
the Botanical Gazette. 

The single-lobed leaves are in great preponderance, constituting two-thirds of the foliage in 
Pittsburg specimens, while, near St Louis, three trees were observed in which other than single- 
lobed leaves were wanting. In these an extensive self-pruning had taken place. The terminal 
leaves of young branches are single-lobed, although there may be an occasional two-lobed leaf. 
Tops of trees are usually composed almost entirely of single-lobed leaves. 

The dissected forms of leaves appear to be most plentifully developed under the influence of 
shade. In such cases they were most thickly distributed at the middle of the tree (as has been 
noted for three-lobed leaves in the Britton and Brown Flora), on young twigs whose terminal 
leaves were dissected, and toward the bottom on older twigs. There was a tendency for more 
three-lobed and less one-lobed leaves to be found on smaller twigs growing near the trunk, but 
occasionally on larger twigs, or smaller boughs growing among the larger boughs. 

No transitional forms between the three-lobed and two-lobed leaves were noted on the same 
tree. The latter apparently increase in number as the three-lobed forms decrease, and are 
associated mostly with the single-lobed leaves, being about equally distributed between the 
younger and older twigs. They are rarely found at the top of the tree. Evidence that the
available amount of light may play some part in the distribution of leaves is found in the fact 
that the great majority of observed seedlings growing in the shade develop the three-lobed or 
two-lobed leaves in combination. Contrast is offered by a statement made in a standard 
American textbook of botany—“ In Sassafras, almost any leaf may be entire or variously lobed, 


* Both series of observations also indicated how satisfactorily the normal law of distribution may 
be applied to material of this kind. 
apparently without relation to transpiration, nutrition, etc.” It will be observed that these
findings substantiate in general those of Fry made in 1902*. 


Bearing in mind the foregoing statement, an attempt was made to ascertain experimentally 
the relation of the amount of light to the kind of leaves developed. This year’s twigs bearing only
one-lobed forms were tied back into shaded positions. Of ten such cases, three twigs produced 
isolated, three-lobed leaves. In another lot of the younger twigs bearing only one-lobed forms, 
the leaves were stripped from the twigs, and these too tied back in the shade. Only one twig of 
this lot responded, producing two two-lobed and one three-lobed leaf. A consistent explanation 
of this fragmentary evidence would be that the formative elements for three-lobed leaves in the 
twigs are stimulated to produce those forms. A more positive point brought out is the lack of
proliferating power in the trees under the conditions of the experiments (when compared with
other forms possessing divided leaves, such as the mulberry), the majority of mutilated twigs at this
season, early August, not renewing their leaves. The writer is indebted for use of material to
Mrs W. G. Gibson of Avalon, Pa., and Prof. W. J. Stevens, Field School, St Louis, Mo. 


III. Life-History Albums. 
By ETHEL M. ELDERTON. 


The Personal and Family History Register† compiled by Dr Taylor is extremely interesting,
and if people could be persuaded to keep the records asked for and to forward the book when 
completed to some central agency such as is intended, the statistical data then available should 
be most useful. In this register under one cover all the children of one family have their life 
histories recorded, and if the individuals are to be studied only in their childhood this is an 
advantage, but if it is hoped by means of a register to provide the life history rather than the 
child history a separate volume for each child would be preferable; then as each child left the
home the book could go with him to be continued and completed. Francis Galton in the Life-History
Album issued years ago preferred this second plan and arranged that each child in the
family should have its own album‡.


To the statistical worker in Eugenics so many problems in heredity are still unsolved,
problems dealing with fertility, with inheritance of disease, with age at death, etc. that no record 
of personal history seems adequate which does not provide the data from which such problems 
can be attacked. In the Personal and Family History Register information as to date of birth 
and date of death is sought for parents, grandparents, great-grandparents, etc. up to the sixty- 
four ancestors in the seventh generation, and such a record is interesting, but one feels that 
cause of death and some information as to general health, if obtainable, would make the data 
more useful. Further there is no space assigned for collaterals. In the introduction the 
following occurs: “It is of interest to obtain data also on collaterals (uncles, aunts, cousins, etc.) 
and alliants (members by marriage). These extras can be inscribed on a page marked ‘Special 
Happenings’ or on separate sheets or cards, and placed in the pocket at the end.” Our ex- 
perience is that even when a special space is provided for an entry the information required 
is not always given, and I think that except in a very few cases extra data of this kind will 
not be given, and I am inclined to think that knowledge of the brothers and sisters of the 
parents is of more importance for determining the hereditary characteristics of an individual 
than knowledge of the great-grandparents. Cousins, we found, were as closely related to one 
another as grandparents to their grandchildren, and the data concerning them could be more


* Biometrika, I. 258, Jan.

† Ourselves. A Personal and Family History Register, by John Madison Taylor, A.B., M.D.,
published by F. A. Davis Company, Philadelphia, 1917.

‡ This Album is now re-issued by the Galton Laboratory through the Cambridge University Press.
Price 9s. net.
easily obtained and would be more reliable than those concerning individuals who lived perhaps
100 years ago. 
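
The figure of sixty-four ancestors quoted above follows from simple doubling, and a short sketch makes the size of the task plain. It assumes that the individual is counted as the first generation and that no ancestor appears twice through intermarriage of kin; the function name is ours and purely illustrative, not part of either register.

```python
def ancestors_in_generation(generation: int) -> int:
    """Direct ancestors in a given generation.

    Assumption: the individual is generation 1, the parents generation 2,
    and so on, with no ancestor counted twice.
    """
    if generation < 1:
        raise ValueError("generation must be 1 or greater")
    return 2 ** (generation - 1)


# Ancestors in the seventh generation, as asked for by the Register.
seventh = ancestors_in_generation(7)

# All ancestors from parents (generation 2) up to the seventh generation.
total_entries = sum(ancestors_in_generation(g) for g in range(2, 8))

print(seventh, total_entries)
```

On these assumptions a complete record would need dates for 126 ancestors in all, which bears on the remark that data for collaterals are the more easily obtained.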

Personally I feel that careful details concerning the life history of a baby, interesting as they 
may be, are of little value to the student of Eugenics, unless the hereditary history is fully 
given. The old difficulty of deciding the relative importance of eugenics or euthenics (the word
used by Dr Taylor to describe the science of right living) is impossible of solution if the facts 
concerning any individual are restricted entirely to one side or the other. In Dr Taylor’s 
Register the family is crowded out by the personal element. Dr Taylor fully recognizes our 
ignorance of the laws of heredity and of the question of how far “pronouncedly unfavourable
heredity” can be influenced by euthenics, but I think he assumes that the race can be improved
through a better environment to an extent which I venture to think is unproven.


Both the Personal and Family History Register and the Life-History Album are rather large 
volumes, somewhat alarming to the busy parent from the very size of them. But those who are 
keenly interested in the well-being of the race will be induced to keep the record; they will be
limited in number and may belong to a rather narrow circle, at least at the present time when 
the science of eugenics is still regarded as the fad of a few individuals. 

I believe that The Record of Family Faculties issued by Francis Galton in 1884 would prove 
far more convenient both for the recorder and for the statistical worker than either of the two 
more bulky registers, the Life-History Album and the Personal and Family History Register. 
It is thirty-five years since this volume was first published and one marvels at the genius 
of the man who then saw what data would be needed to solve the problems of the present day. 
The introduction to this book supplies in a few words the justification for requiring the data 
and indicates the reason for the questions asked. Thus:


3. Age at marriage    Total No. of sons         No. of sons deceased         Ages
4. Age of husband     Total No. of daughters    No. of daughters deceased    Ages


In the introduction Francis Galton writes “The ages at marriage of the two parents, the 
number and the duration of life of the children, would enable inquiry to be made into fertility 
as associated with different admixtures of race or of disease tendencies. We have yet to learn 
the conditions under which some families are prolific in their various branches, and others die 
out.” Further Question 5 is, Mode of life so far as affecting growth or health, and the justi- 
fication for asking this question is as follows: “The mode of life, so far as it affects growth 
or health, would, if known, throw light on the effect of nurture over nature. We require to 
select the families in each of which there had been a noticeable difference in the mode of life of 
two or more of its members, and to cross divide those members into two groups, in one of 
which the mode of life had been healthy, the other in which it had been the reverse. Then by 
contrasting these groups we should see the relative effects of good and bad nurture on the 
development of body and mind, and on the health, fertility, and duration of life.” 

According to the problems with which one has come in contact, each investigator would 
desire certain modifications in the questions asked, but on the whole, I believe that a collection
of Records of Family Faculties would enable one to determine “many vital questions in domestic
economies,” and it is very desirable that this book of Galton’s should be reissued. 


IV. The Check to the Fall in the Phthisis Death-rate since the Discovery
of the Tubercle Bacillus and the Adoption of Modern Treatment.
By KARL PEARSON, F.R.S. 


In 1911* I pointed out that from ’65 to about ’95 there was a continuous and rapid fall in
the corrected phthisis death-rate, and also in the percentage which the deaths from phthisis 
were of all deaths. I further indicated that from 1895 onwards there had been a check to this 


* The Fight against Tuberculosis and the Death-rate from Phthisis, Cambridge University Press, 1911. 
rapid fall and that the curves seemed to indicate that an actual rise in the phthisis death-rate
might in the near future be reasonably anticipated. This view was rendered still more probable 
when I plotted the returns for 1910 to 1914. Since then the Great War has rendered it almost 
impossible for us so to feel our way in mortality statistics, that we can get returns comparable 
with the pre-war data. It seems to me, however, just worth while to see what our graphs 
will look like with the war years added to them. I must thank Dr Stevenson of the General 
Register Office for a renewal of his unfailing courtesy in providing me with the required data, 
and furthermore for several valuable suggestions as to the source of the remarkable results 
manifested. 


If we could trust the accompanying diagrams the anticipated rise in the phthisis death-rate 
has already occurred. But complete trust would be very much misplaced. In the first place 
our phthisis death-rate is for civilians, and since able-bodied civilians have been largely drawn 
into the army, there has naturally been a heavier death-rate of all kinds, and therefore a heavier 
phthisis death-rate than in pre-war times. There might therefore be nothing really significant 
in the marked male death-rate rise. On the other hand this explanation hardly applies to the 
rise—it is true not so marked—in the female death-rate. At the same time the whole nation, 
male and female, has been more crowded together in factories and subject to far greater strain 
than in pre-war days. This would naturally tend to emphasise the death-rate of women as well 
as of men. If we turn, however, to our second diagram we see that not only has the phthisis 
death-rate increased like the general death-rate, but it has been increasing at a more rapid rate 
than the general death-rate. This can only be accounted for on the assumption that phthisis 
more than all other diseases will be emphasised by war-strain. It can hardly be said that we 
were relieved of war-strain during 1918, indeed some of the hardest months of work and some 
of the periods of heaviest depression occurred in that year; there was further a most severe 
epidemic of influenza, and many deaths, Dr Stevenson tells me, recorded as influenza and phthisis 
were tabulated under the latter. Yet notwithstanding strain and influenza the proportion of 
phthisis deaths to deaths in general fell (see Diagram ii).


A noteworthy feature is that the tuberculous mortality in lunatic asylums increased in an 
extraordinary manner from an average of 1800 deaths in 1912-14 to 5605 deaths in 1918. 
Dr Stevenson tells me that this will practically account for half the increase in tuberculous 
deaths for the total population in that time. Now this raises very important questions which 
ought to be answered. Were the lunatics who died of tuberculosis lunatics before the war, and 
again were they tuberculous before the war? Or did more lunatics become tuberculous owing 
to bad conditions—removal of much nursing and medical supervision—during the war? Or 
again did the tuberculous lunatics enter the asylum during the war? That is: Were the 
phthisical, simply because of their phthisis, less able to avoid mental breakdown under the 
severe war conditions? If so they would probably have died of phthisis outside the asylum 
in non-war conditions, and it would not be legitimate to cite the increased tuberculous deaths in 
asylums as something anomalous. 
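
The asylum figures just quoted may be set out as a rough piece of arithmetic. The sketch below uses only the two numbers given in the text and reads “practically half” as exactly one half, which is an assumption; the implied total is therefore an order-of-magnitude check and not a reported return.

```python
# Figures quoted in the text.
asylum_deaths_prewar = 1800   # average tuberculous deaths in asylums, 1912-14
asylum_deaths_1918 = 5605     # tuberculous deaths in asylums, 1918

# Absolute and relative increase in asylum tuberculous deaths.
asylum_increase = asylum_deaths_1918 - asylum_deaths_prewar
ratio = asylum_deaths_1918 / asylum_deaths_prewar

# Assumption: "practically ... half" is read as exactly one half of the
# increase in tuberculous deaths for the total population.
asylum_share = 0.5
implied_total_increase = asylum_increase / asylum_share

print(f"asylum increase: {asylum_increase} deaths ({ratio:.2f} times the pre-war average)")
print(f"implied increase for the total population: about {implied_total_increase:.0f} deaths")
```

The point of the sketch is only that a small institutional population contributes a share of the increase quite out of proportion to its size, which is why the questions raised above need answering.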


On the whole it is risky to form a very definite judgment, but having regard to the female 
phthisis death-rate and to the percentage of the phthisis death-rate on the general death-rate, 
war difficulties do not seem to me sufficient to obscure the general trend of our graphs (as 
indicated before the war), namely that somewhere about 1915 the fall in the phthisis rate which 
had been less rapid since 1895 would cease altogether and probably be followed by a rise. The 
next five years will show whether this be true or not. We should expect a fall in the phthisis 
death-rate immediately, but on the average the value will remain higher than that of 1915. 